I have a couple of for loops that I want to vectorize in order to improve performance. They operate on 1 x N matrices.
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))

for i in range(len(array[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
The operations rely on the array index, which the for loop provides. Is there any way to access the index while using numpy.vectorize so that I can rewrite these as vectorized functions?
First loop:
import numpy as np

array = np.zeros((1, 10))
otherArray = np.arange(30).reshape(1, -1)
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')

# original loop
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))
print(f'array = \n{array}')

# vectorized equivalent: group every 3 consecutive columns and average each group
array = np.floor(np.nanmean(otherArray.reshape(-1, 3), axis=1)).reshape(1, -1)
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
Second loop:
array = np.zeros((1, 10))
otherArray = np.arange(10, dtype=float).reshape(1, -1)
L = 1
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')

# original loop
for i in range(len(otherArray[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
print(f'array = \n{array}')

# with L = 1 the index (i-1)*L+1 reduces to i, so the loop is a plain copy
array = otherArray
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
It looks like in the first loop you are averaging over non-overlapping windows of width 3 (a block average rather than a sliding moving average). This is best done like this:
import numpy as np

window_width = 3
arr = np.arange(12)
out = np.floor(np.nanmean(arr.reshape(-1, window_width), axis=-1))
print(out)
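If an overlapping (sliding) moving average was actually intended, here is a minimal sketch using np.convolve instead; note that, unlike nanmean, convolve does not ignore NaNs:

import numpy as np

window_width = 3
arr = np.arange(12, dtype=float)
# 'valid' yields one mean per fully-populated window,
# i.e. len(arr) - window_width + 1 values in total
sliding = np.floor(np.convolve(arr, np.ones(window_width) / window_width, mode='valid'))
print(sliding)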
Regarding your second loop, I have no clue what it does. Are you trying to copy values from otherArray to array with some offset? I'd recommend you look at numpy's slicing functionality.
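For example, here is a minimal sketch of strided slice assignment, assuming the goal is to scatter otherArray's values into array at a stride of L (the value L = 2 is made up for illustration):

import numpy as np

L = 2
otherArray = np.arange(5, dtype=float).reshape(1, -1)
array = np.zeros((1, otherArray.shape[1] * L))

# every L-th column of array receives one value from otherArray
array[0, ::L] = otherArray[0]
print(array)  # [[0. 0. 1. 0. 2. 0. 3. 0. 4. 0.]]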
Related
I have a 2D numpy array. I'm trying to compute the similarities between rows and put them into a similarities array. Is this possible without a loop? Thanks for your time!
# ratings.shape = (943, 1682)
arri = np.zeros(943)
arri = np.where(arri == 0)[0]
arrj = np.zeros(943)
arrj = np.where(arrj == 0)[0]
similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
similarities[arri, arrj] = np.abs(ratings[arri] - ratings[arrj])
I want to build a 2D array similarities such that similarities[i, j] is the difference between row i and row j in ratings, but I get:

ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)
The problem is how numpy iterates through the array when you index a two-dimensional array with two index arrays.
First some setup:
import numpy

ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0], 1))[0]
indicesY = numpy.indices((ratings.shape[0], 1))[0]

ratings:  [1 2 3 4 5]
indicesX: [[0] [1] [2] [3] [4]]
indicesY: [[0] [1] [2] [3] [4]]
Now let's see what your program produces:
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX] - ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
As you can see, numpy iterates over similarities basically like the following:
for i in range(5):
    similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i] - ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
Now instead we need indices like the following to cover the entire array:
indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]
We can build them as follows:

# Reshape indicesX from (x, 1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
indicesX: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
indicesY: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]
Perfect! Now just call similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX] - ratings[indicesY]) again and we see:
similarities:
[[0. 1. 2. 3. 4.]
[1. 0. 1. 2. 3.]
[2. 1. 0. 1. 2.]
[3. 2. 1. 0. 1.]
[4. 3. 2. 1. 0.]]
Here is the whole code again:
import numpy

ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0], 1))[0]
indicesY = numpy.indices((ratings.shape[0], 1))[0]
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))

indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])

similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX] - ratings[indicesY])
print(similarities)
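For completeness, the same table can be produced without index arrays at all, using broadcasting (a sketch for the 1-D demo above):

import numpy

# ratings[:, None] has shape (5, 1) and ratings[None, :] has shape (1, 5),
# so the subtraction broadcasts to the full (5, 5) table of differences
ratings = numpy.arange(1, 6)
similarities = numpy.abs(ratings[:, None] - ratings[None, :])
print(similarities)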
PS: You commented on your own post to improve it. When you want to improve a question, edit it instead of commenting on it.
I have an empty matrix M of shape:
(179, 179)
Now I want to populate it using the following loop:
for game in range(len(games)-1):
    df_round = df_games_position[df_games_position['rodada_id'] == games['rodada_id'][game]]
    players_home = df_round[df_round['time_id'] == games['time_id'][game]]
    players_away = df_round[df_round['time_id'] == games['adversario_id'][game]]
    count = 0
    for j_home in range(len(players_home)):
        count_fora = 0
        for j_away in range(len(players_away)):
            score_home = 0
            score_away = 0
            points_j_home = players_home['points_num'].iloc[j_home]
            points_j_away = players_away['points_num'].iloc[j_away]
            print('POINTS HOME', points_j_home)
            print('POINTS AWAY', points_j_away)
            soma = points_j_home + points_j_away
            if soma != 0:
                score_home = points_j_home / soma
                score_away = points_j_away / soma
            print('SCORE HOME', score_home)
            print('SCORE AWAY', score_away)
            j1 = players_home['Rank'].iloc[j_home].astype('int64')
            j2 = players_away['Rank'].iloc[j_away].astype('int64')
            print('j1', j1)
            print('j2', j2)
            M[j1, j1] = M[j1, j1] + games['goals_home_norm'][game] + score_home
            M[j1, j2] = M[j1, j2] + games['goals_away_norm'][game] + score_away
            M[j2, j1] = M[j2, j1] + games['goals_home_norm'][game] + score_home
            M[j2, j2] = M[j2, j2] + games['goals_away_norm'][game] + score_away
            print(M)
            count += 1
            print('COUNT', count)
Finally I get the error:
M[j1, j2] = M[j1, j2] + games['goals_away_norm'][game] + score_away
IndexError: index 179 is out of bounds for axis 1 with size 179
The prints from my last iteration:
COUNT 3
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 162
[[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
...
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 8.57263145]]
COUNT 4
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 179
What am I missing?
The matrix is a numpy array, and numpy indexing starts at 0, not 1:

np.array([1,2,3,4]).shape
Out[29]: (4,)
np.array([1,2,3,4])[3]
Out[30]: 4

So a (179, 179) matrix only accepts indices 0..178 on each axis, while your Rank values run up to 179. A simple fix is to create the empty M with shape (180, 180), populate it, and drop the unused row and column at the end:

M = M[1:,1:]
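A minimal sketch of that workaround (the += line is just a stand-in for the updates in your loop):

import numpy as np

# one extra row and column so Rank values 1..179 are valid indices
M = np.zeros((180, 180))
M[179, 179] += 1.0     # the largest Rank now fits

# drop the unused index 0 to get back the intended (179, 179) matrix
M = M[1:, 1:]
print(M.shape)         # (179, 179)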
I have written some code to identify the connected components in a binary image, using recursive depth-first search. However, for some images the Python recursion limit is not enough. Even when I increase the limit to the maximum my computer supports, the program still fails for some images. How can I implement DFS iteratively? Or is there another, better solution?
My code:
import numpy as np

count = 1
height = 4
width = 5
# pad the grid with a one-pixel border so neighbour lookups never go out of bounds
g = np.zeros((height + 2, width + 2))
w = np.zeros((height + 2, width + 2))
dx = [-1, 0, 1, 1, 1, 0, -1, -1]
dy = [1, 1, 1, 0, -1, -1, -1, 0]

def dfs(x, y, c):
    global w
    w[x][y] = c
    for i in range(8):
        nx = x + dx[i]
        ny = y + dy[i]
        if g[nx][ny] and not w[nx][ny]:
            dfs(nx, ny, c)

def find_connected_components(image):
    global count, g
    g[1:-1, 1:-1] = image
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            if g[i][j] and not w[i][j]:
                dfs(i, j, count)
                count += 1

mask1 = np.array([[0,0,0,0,1],[0,1,1,0,1],[0,0,1,0,0],[1,0,0,0,1]])
find_connected_components(mask1)
print(mask1)
print(w[1:-1, 1:-1])
Input and Output:
[[0 0 0 0 1]
[0 1 1 0 1]
[0 0 1 0 0]
[1 0 0 0 1]]
[[ 0. 0. 0. 0. 1.]
[ 0. 2. 2. 0. 1.]
[ 0. 0. 2. 0. 0.]
[ 3. 0. 0. 0. 4.]]
Keep a list of locations to visit, and use a while loop that visits each location, popping it off the list as it goes. Like so:
def dfs(x, y, c):
    global w
    # an explicit stack replaces the recursion's call stack
    locs = [(x, y, c)]
    while locs:
        x, y, c = locs.pop()
        w[x][y] = c
        for i in range(8):
            nx = x + dx[i]
            ny = y + dy[i]
            if g[nx][ny] and not w[nx][ny]:
                locs.append((nx, ny, c))
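As for a "better solution": if scipy is available, scipy.ndimage.label does connected-component labelling directly. A sketch, where an all-ones 3x3 structure gives the same 8-connectivity as the dx/dy neighbourhood above:

import numpy as np
from scipy import ndimage

mask1 = np.array([[0,0,0,0,1],[0,1,1,0,1],[0,0,1,0,0],[1,0,0,0,1]])
# 8-connectivity: diagonal neighbours count as connected
labels, num = ndimage.label(mask1, structure=np.ones((3, 3)))
print(labels)   # labelled components
print(num)      # number of components found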
I am working on an HMM algorithm. As I am new to Python, I want to know how I can dynamically change the dimensions of a matrix using numpy.
I am getting inconsistent output with numpy resize.
1 [[ 1.]]
2 [[ 1. 1.]]
[[ 2. 1.]]
3 [[ 2. 1. 0.]
[ 0. 0. 1.]]
4 [[ 2. 1. 0. 0.]
[ 0. 1. 0. 0.]
[ 0. 0. 0. 1.]]
In the fourth step I expect the output to be:
[[ 2. 1. 0. 0.]
[ 0. 0. 1. 0.]
[ 0. 0. 0. 1.]]
Code:
with open("SampleFile.tsv") as tsvfile:
tsvreader = csv.reader(tsvfile, delimiter="\t")
wordindex = 0
postagindex = 0
flag = 0
for line in tsvreader:
if len(line) > 0:
if line[1] not in words:
wcount += 1
flag = 1
#wordlikelihoodmatrix.resize(poscount, wcount)
words.append(line[1])
wordindex = words.index(line[1])
else:
wordindex = words.index(line[1])
if line[2] not in poslist:
poscount += 1
flag = 1
#wordlikelihoodmatrix.resize(poscount, wcount)
poslist.append(line[2])
posindex = poslist.index(line[2])
else:
posindex = poslist.index(line[2])
if flag == 1:
wordlikelihoodmatrix.resize(poscount, wcount)
flag = 0
wordlikelihoodmatrix[posindex][wordindex] +=1
print wordlikelihoodmatrix;
Can anybody suggest where I am going wrong while resizing, or any other way to get the expected output without using numpy?
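For what it's worth, a minimal sketch of why ndarray.resize can scramble a 2-D matrix: the data is kept as a flat row-major buffer that simply gets re-cut to the new shape, so growing the column count shifts values across rows. Padding (or copying into a freshly allocated array) keeps each element in place:

import numpy as np

m = np.arange(1, 7, dtype=float).reshape(2, 3)
print(m)      # [[1. 2. 3.] [4. 5. 6.]]

# ndarray.resize zero-pads the flat buffer [1 2 3 4 5 6] and re-cuts it
# row by row, so existing values migrate across row boundaries:
m.resize((2, 4), refcheck=False)
print(m)      # [[1. 2. 3. 4.] [5. 6. 0. 0.]]

# np.pad instead appends a zero column without moving anything:
m2 = np.arange(1, 7, dtype=float).reshape(2, 3)
m2 = np.pad(m2, ((0, 0), (0, 1)))
print(m2)     # [[1. 2. 3. 0.] [4. 5. 6. 0.]]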
I'm using Theano 0.7 to create a convolutional neural net which uses max-pooling (i.e. shrinking a matrix down by keeping only the local maxima).
In order to "undo" or "reverse" the max-pooling step, one method is to store the locations of the maxima as auxiliary data, then simply recreate the un-pooled data by making a big array of zeros and using those auxiliary locations to place the maxima in their appropriate locations.
Here's how I'm currently doing it:
import numpy as np
import theano
import theano.tensor as T

minibatchsize = 2
numfilters = 3
numsamples = 4
upsampfactor = 5

# HERE is the function that I hope could be improved
def upsamplecode(encoded, auxpos):
    shp = encoded.shape
    upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))
    for whichitem in range(minibatchsize):
        for whichfilt in range(numfilters):
            upsampled = T.set_subtensor(upsampled[whichitem, whichfilt, auxpos[whichitem, whichfilt, :]], encoded[whichitem, whichfilt, :])
    return upsampled

totalitems = minibatchsize * numfilters * numsamples
code = theano.shared(np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)))
auxpos = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)) % upsampfactor  # arbitrary positions within a bin
auxpos += (np.arange(4) * 5).reshape((1, 1, -1))  # shifted to the actual temporal bin location
auxpos = theano.shared(auxpos.astype(np.int))

print "code:"
print code.get_value()
print "locations:"
print auxpos.get_value()

get_upsampled = theano.function([], upsamplecode(code, auxpos))
print "the un-pooled data:"
print get_upsampled()
(By the way, in this case I have a 3D tensor, and it's only the third axis that gets max-pooled. People who work with image data might expect to see two dimensions getting max-pooled.)
The output is:
code:
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
locations:
[[[ 0 6 12 18]
[ 4 5 11 17]
[ 3 9 10 16]]
[[ 2 8 14 15]
[ 1 7 13 19]
[ 0 6 12 18]]]
the un-pooled data:
[[[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 2. 0.
0. 0. 0. 0. 3. 0.]
[ 0. 0. 0. 0. 4. 5. 0. 0. 0. 0. 0. 6. 0. 0.
0. 0. 0. 7. 0. 0.]
[ 0. 0. 0. 8. 0. 0. 0. 0. 0. 9. 10. 0. 0. 0.
0. 0. 11. 0. 0. 0.]]
[[ 0. 0. 12. 0. 0. 0. 0. 0. 13. 0. 0. 0. 0. 0.
14. 15. 0. 0. 0. 0.]
[ 0. 16. 0. 0. 0. 0. 0. 17. 0. 0. 0. 0. 0. 18.
0. 0. 0. 0. 0. 19.]
[ 20. 0. 0. 0. 0. 0. 21. 0. 0. 0. 0. 0. 22. 0.
0. 0. 0. 0. 23. 0.]]]
This method works but it's a bottleneck, taking most of my computer's time (I think the set_subtensor calls might imply cpu<->gpu data copying). So: can this be implemented more efficiently?
I suspect there's a way to express this as a single set_subtensor() call which may be faster, but I don't see how to get the tensor indexing to broadcast properly.
UPDATE: I thought of a way of doing it in one call, by working on the flattened tensors:
def upsamplecode2(encoded, auxpos):
    shp = encoded.shape
    upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))
    add_to_flattened_indices = theano.shared(np.array([[[(y + z * numfilters) * numsamples * upsampfactor for x in range(numsamples)] for y in range(numfilters)] for z in range(minibatchsize)], dtype=theano.config.floatX).flatten(), name="add_to_flattened_indices")
    upsampled = T.set_subtensor(upsampled.flatten()[T.cast(auxpos.flatten() + add_to_flattened_indices, 'int32')], encoded.flatten()).reshape(upsampled.shape)
    return upsampled

get_upsampled2 = theano.function([], upsamplecode2(code, auxpos))
print "the un-pooled data v2:"
ups2 = get_upsampled2()
print ups2
However, this is still not good efficiency-wise, because when I run it (added on to the end of the above script) I find that the CUDA libraries can't currently do the integer index manipulation efficiently:
ERROR (theano.gof.opt): Optimization failure due to: local_gpu_advanced_incsubtensor1
ERROR (theano.gof.opt): TRACEBACK:
ERROR (theano.gof.opt): Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/theano/gof/opt.py", line 1493, in process_node
replacements = lopt.transform(node)
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/opt.py", line 952, in local_gpu_advanced_incsubtensor1
gpu_y = gpu_from_host(y)
File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 507, in __call__
node = self.make_node(*inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/basic_ops.py", line 133, in make_node
dtype=x.dtype)()])
File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/type.py", line 69, in __init__
(self.__class__.__name__, dtype, name))
TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype int64 for variable None
I don't know whether this is faster, but it may be a little more concise. See if it is useful for your case.
import numpy as np
import theano
import theano.tensor as T
minibatchsize = 2
numfilters = 3
numsamples = 4
upsampfactor = 5
totalitems = minibatchsize * numfilters * numsamples
code = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples))
auxpos = np.arange(totalitems).reshape((minibatchsize, numfilters, numsamples)) % upsampfactor
auxpos += (np.arange(4) * 5).reshape((1,1,-1))
# first in numpy
shp = code.shape
upsampled_np = np.zeros((shp[0], shp[1], shp[2] * upsampfactor))
# the three index arrays broadcast together to auxpos's shape (2, 3, 4), so each
# code value lands at [item, filter, auxpos[item, filter, sample]] in one shot
upsampled_np[np.arange(shp[0]).reshape(-1, 1, 1), np.arange(shp[1]).reshape(1, -1, 1), auxpos] = code
print "numpy output:"
print upsampled_np
# now the same idea in theano
encoded = T.tensor3()
positions = T.tensor3(dtype='int64')
shp = encoded.shape
upsampled = T.zeros((shp[0], shp[1], shp[2] * upsampfactor))
upsampled = T.set_subtensor(upsampled[T.arange(shp[0]).reshape((-1, 1, 1)), T.arange(shp[1]).reshape((1, -1, 1)), positions], encoded)
print "theano output:"
print upsampled.eval({encoded: code, positions: auxpos})
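As a quick sanity check (a sketch, reusing the variables already defined above), the broadcasted theano result can be compared against the numpy version:

# both approaches should scatter the same values to the same positions
assert np.allclose(upsampled_np, upsampled.eval({encoded: code, positions: auxpos}))
print "numpy and theano agree"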