I have written some code to identify the connected components in a binary image, using a recursive depth-first search. However, for some images Python's recursion limit is not enough. Even when I raise the limit to the maximum my computer supports, the program still fails for some images. How can I implement DFS iteratively? Or is there a better solution?
My code:
import numpy as np

count=1
height = 4
width = 5
# g holds the image with a one-pixel border of zeros; w holds the labels
g = np.zeros((height+2,width+2))
w = np.zeros((height+2,width+2))
# offsets of the 8 neighbours
dx = [-1,0,1,1,1,0,-1,-1]
dy = [1,1,1,0,-1,-1,-1,0]

def dfs(x,y,c):
    global w
    w[x][y]=c
    for i in range(8):
        nx = x+dx[i]
        ny = y+dy[i]
        if g[nx][ny] and not w[nx][ny]:
            dfs(nx,ny,c)

def find_connected_components(image):
    global count,g
    g[1:-1,1:-1]=image
    for i in range(1,height+1):
        for j in range(1,width+1):
            if g[i][j] and not w[i][j]:
                dfs(i,j,count)
                count+=1

mask1 = np.array([[0,0,0,0,1],[0,1,1,0,1],[0,0,1,0,0],[1,0,0,0,1]])
find_connected_components(mask1)
print(mask1)
print(w[1:-1,1:-1])
Input and Output:
[[0 0 0 0 1]
[0 1 1 0 1]
[0 0 1 0 0]
[1 0 0 0 1]]
[[ 0. 0. 0. 0. 1.]
[ 0. 2. 2. 0. 1.]
[ 0. 0. 2. 0. 0.]
[ 3. 0. 0. 0. 4.]]
Keep a list of locations that still have to be visited.
Use a while loop that pops one location off the list on each pass and visits it, until the list is empty.
Like so:
def dfs(x,y,c):
    global w
    locs = [(x,y,c)]
    while locs:
        x,y,c = locs.pop()
        w[x][y]=c
        for i in range(8):
            nx = x+dx[i]
            ny = y+dy[i]
            if g[nx][ny] and not w[nx][ny]:
                locs.append((nx, ny, c))
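As a self-contained sketch of the same idea, the version below folds the explicit stack into one function without globals. The function name `label_components` and its return value are assumptions made for illustration, not part of the original code. It also marks a pixel when it is pushed rather than when it is popped, so each pixel enters the stack at most once.

```python
import numpy as np

def label_components(image):
    """Label 8-connected components of a binary image using an explicit stack."""
    image = np.asarray(image)
    height, width = image.shape
    # Pad with a one-pixel border of zeros so neighbour lookups stay in bounds.
    grid = np.zeros((height + 2, width + 2), dtype=bool)
    grid[1:-1, 1:-1] = image != 0
    labels = np.zeros((height + 2, width + 2), dtype=int)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    count = 0
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            if grid[i, j] and labels[i, j] == 0:
                count += 1
                labels[i, j] = count
                stack = [(i, j)]
                while stack:
                    x, y = stack.pop()
                    for dx, dy in offsets:
                        nx, ny = x + dx, y + dy
                        if grid[nx, ny] and labels[nx, ny] == 0:
                            # Mark on push so the pixel is never stacked twice.
                            labels[nx, ny] = count
                            stack.append((nx, ny))
    return labels[1:-1, 1:-1], count

mask1 = np.array([[0, 0, 0, 0, 1],
                  [0, 1, 1, 0, 1],
                  [0, 0, 1, 0, 0],
                  [1, 0, 0, 0, 1]])
labels, n_components = label_components(mask1)
print(labels)
print(n_components)
```

For the mask from the question this reproduces the labels shown there (four components). If SciPy is available, scipy.ndimage.label with a 3x3 structure of ones is a library alternative for 8-connectivity; it is mentioned only as an option and was not checked against this example.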
Related
I have a couple of for loops that I want to vectorize in order to improve performance. They operate on 1 x N matrices.
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))
for i in range(len(array[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
The operations are reliant on the index of the array which is given by the for loop. Is there any way to access the index while using numpy.vectorize so that I can rewrite these as vectorized functions?
First loop:
import numpy as np
array = np.zeros((1, 10))
otherArray = np.arange(30).reshape(1, -1)
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))
print(f'array = \n{array}')
array = np.floor(np.nanmean(otherArray.reshape(-1, 3), axis = 1)).reshape(1, -1)
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
Second loop:
array = np.zeros((1, 10))
otherArray = np.arange(10, dtype = float).reshape(1, -1)
L = 1
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')
for i in range(len(otherArray[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
print(f'array = \n{array}')
array = otherArray
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
It looks like the first loop averages consecutive, non-overlapping windows of three elements (a block average rather than a true moving average). This is best done like this:
import numpy as np
window_width = 3
arr = np.arange(12)
out = np.floor(np.nanmean(arr.reshape(-1,window_width) ,axis=-1))
print(out)
Regarding your second loop, I have no clue what it does. You are trying to copy values from otherArray to array with some offset? I’d recommend you look at numpy’s slicing functionality.
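One way to read the second loop is as an assignment through a computed index array. The sketch below assumes L = 2 and small array sizes purely for illustration (the question does not say what L is) and checks the vectorized form against the original loop. Note that for i = 0 the target index is 1 - L, which is negative for L > 1 and therefore wraps around to the end of the array, in the loop and in the vectorized form alike.

```python
import numpy as np

L = 2  # assumed step; not given in the question
other = np.arange(5, dtype=float).reshape(1, -1)

# Original loop, kept as the reference result.
looped = np.zeros((1, 10))
for i in range(other.shape[1]):
    looped[0, int((i - 1) * L + 1)] = other[0, i]

# Vectorized: compute every target index at once and assign through it.
idx = ((np.arange(other.shape[1]) - 1) * L + 1).astype(int)
vectorized = np.zeros((1, 10))
vectorized[0, idx] = other[0]

print(idx)
print(np.array_equal(looped, vectorized))
```

This only matches the loop one-to-one when the computed indices are distinct, as they are here.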
I have a 2D numpy array. I'm trying to compute the similarities between rows and put them into a similarities array. Is this possible without a loop? Thanks for your time!
# ratings.shape = (943, 1682)
arri = np.zeros(943)
arri = np.where(arri == 0)[0]
arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]
similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])
I want to build a 2D array similarities in which similarities[i, j] is the difference between row i and row j of ratings.
ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)
Screenshot: https://i.stack.imgur.com/gtst9.png
The problem is how numpy iterates through the array when indexing a two-dimensional array with two arrays.
First some setup:
import numpy
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
ratings: [1 2 3 4 5]
indicesX: [[0][1][2][3][4]]
indicesY: [[0][1][2][3][4]]
Now let's see what your program produces:
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
As you can see, numpy iterates over similarities basically like the following:
for i in range(5):
    similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
Now instead we need indices like the following to iterate through the entire array:
indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]
We do that as follows:
# Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
indicesX: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
indicesY: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]
Perfect! Now just call similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY]) again and we see:
similarities:
[[0. 1. 2. 3. 4.]
[1. 0. 1. 2. 3.]
[2. 1. 0. 1. 2.]
[3. 2. 1. 0. 1.]
[4. 3. 2. 1. 0.]]
Here is the whole code again:
import numpy
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
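The same matrix can also be obtained by broadcasting, without building the index arrays. The sketch below shows this for the 1D example above. For 2D ratings each pair of rows has to be reduced to a single number; summing the absolute differences is an assumption made here for illustration, since the question does not say which reduction it wants, and `ratings_2d` is a made-up example.

```python
import numpy as np

ratings = np.arange(1, 6)

# A (5, 1) column broadcast against a (1, 5) row yields every pair at once.
similarities = np.abs(ratings[:, None] - ratings[None, :])
print(similarities)

# Assumption: for 2D ratings, reduce each row pair to one number by summing
# the absolute differences over the last axis.
ratings_2d = np.array([[1, 2, 3],
                       [2, 2, 2],
                       [4, 0, 3]])
pairwise = np.abs(ratings_2d[:, None, :] - ratings_2d[None, :, :]).sum(axis=-1)
print(pairwise)
```

The 2D form builds an intermediate array of shape (rows, rows, columns), which for ratings of shape (943, 1682) is about 1.5 billion elements, so processing the rows in chunks may be needed there.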
PS
You commented on your own post to improve it. You should edit your question instead of commenting on it, when you want to improve it.
I have a python numpy 3x4 array A:
A=np.array([[0,1,2,3],[4,5,6,7],[1,1,1,1]])
and a 3x3 array B:
B=np.array([[1,1, 1],[2, 2, 2],[3,3,3]])
I am trying to use a numpy operation to produce array C where each element in C is based on an equation using corresponding elements in A and the entire row in B. A simplified example:
C[row,col] = A[row,col] * ( A[row,col] / B[row,0] + B[row,1] + B[row,2] )
My first thought was simply to multiply all of A by a column of B, but that raises an error.
C = A * B[:,0]
Then I thought to try this but it didn't work.
C = A[:,:] * B[:,0]
I am not sure how to use the " : " operator and get access to the specific row, col at the same time. I can do this in regular loops but I wanted something more numpy.
import numpy as np
A=np.array([[0,1,2,3],[4,5,6,7],[1,1,1,1]])
B=np.array([[1,1, 1],[2, 2, 2],[3,3,3]])
C=np.zeros([3,4])
row,col = A.shape
print(A.shape)
print(A)
print(B.shape)
print(B)
print(C.shape)
print(C)
print(range(row-1))
for row in range(row):
    for col in range(col):
        C[row,col] = A[row,col] * (( A[row,col] / B[row,0]) + B[row,1] + B[row,2])
print(C)
Which prints:
(3, 4)
[[0 1 2 3]
[4 5 6 7]
[1 1 1 1]]
(3, 3)
[[1 1 1]
[2 2 2]
[3 3 3]]
(3, 4)
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
range(0, 2)
[[ 0. 3. 8. 15. ]
[24. 32.5 42. 0. ]
[ 6.33333333 6.33333333 0. 0. ]]
Suggestions on a better way?
Edited:
Now that I understand broadcasting a bit more, and got that code running, let me expand in a generic way what I am trying to solve. I am trying to map values of a category such as "Air" which can be a range (such as 0-5) that have to be mapped to a shade of a given RGB value. The values are recorded over a time period.
For example, at time 1, the value of Water is 4. The standard RGB color for Water is Blue (0,0,255). There are 5 possible values for Water. In the case of Blue, 255 / 5 = 51. To get the effect of the 4 value on the Blue palette, multiply 51 x 4 = 204. Since we want higher values to be darker, we subtract 255 (white) - 204, yielding 51. The Red and Green components end up being 0. So the value read at time N is a multiplier on the weighted R, G and B values. We invert 0 values to be subtracted from 255 so they appear white. Stronger values are darker.
So to calculate the R' G' and B' for time 1 I used:
answer = data[:,1:4] - (data[:,1:4] / data[:,[0]] * data[:,[4]])
I can extract an [R, G, B] from answer and put it into an image at some x,y. That works well. But I can't figure out how to use Range, R, G and B to calculate the new R', G', B' for all of Time 1, 2, ... N. I would like to extend the numpy approach if possible. I did it with standard loops as:
for row in range(rows):
    for col in range(cols):
        r = int(data[row,1] - (data[row,1] / data[row,0] * data[row,col_offset+col] ))
        g = int(data[row,2] - (data[row,2] / data[row,0] * data[row,col_offset+col] ))
        b = int(data[row,3] - (data[row,3] / data[row,0] * data[row,col_offset+col] ))
        almostImage[row,col] = [r,g,b]
I can display the image in matplotlib and save it to .png, etc. So I think next step is to try list comprehension over the time points 2D array, and then refer back to the range and RGB values. Will give it a try.
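The loop above can probably be expressed with broadcasting over a third axis. The following is a sketch only: the column layout (range in column 0, R, G, B in columns 1 to 3, time samples starting at col_offset = 4) is inferred from the loop, and the sample values in `data` are made up for illustration.

```python
import numpy as np

# Hypothetical layout inferred from the loop: column 0 is the range,
# columns 1-3 are R, G, B, and the time samples start at col_offset.
col_offset = 4
data = np.array([[5.0, 0.0, 0.0, 255.0, 4.0, 0.0, 5.0],
                 [5.0, 255.0, 0.0, 0.0, 1.0, 2.0, 3.0]])

value_range = data[:, 0][:, None, None]     # shape (rows, 1, 1)
rgb = data[:, 1:4][:, None, :]              # shape (rows, 1, 3)
samples = data[:, col_offset:][:, :, None]  # shape (rows, times, 1)

# Same formula as the loop, broadcast to shape (rows, times, 3).
image = (rgb - rgb / value_range * samples).astype(int)

# Reference result from the original nested loops.
rows, cols = samples.shape[0], samples.shape[1]
expected = np.zeros((rows, cols, 3), dtype=int)
for row in range(rows):
    for col in range(cols):
        for k in range(3):
            expected[row, col, k] = int(
                data[row, 1 + k]
                - data[row, 1 + k] / data[row, 0] * data[row, col_offset + col])

print(image.shape)
print(np.array_equal(image, expected))
```

The first row mirrors the Water example: a value of 4 on a range of 5 turns blue 255 into 51.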
Try this:
A*(A / B[:,[0]] + B[:,1:].sum(1, keepdims=True))
Output:
array([[ 0. , 3. , 8. , 15. ],
[24. , 32.5 , 42. , 52.5 ],
[ 6.33333333, 6.33333333, 6.33333333, 6.33333333]])
Explanation:
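The first operation, A/B[:,[0]], relies on numpy broadcasting: B[:,[0]] has shape (3, 1), so each row of A is divided by the first element of the matching row of B.
Then B[:,1:].sum(1, keepdims=True) is just B[:,1] + B[:,2], and keepdims=True keeps the result as a (3, 1) column so that it also broadcasts against A. Print it to see details.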
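To make the shapes concrete, the sketch below splits the expression into its parts and checks it against an explicit loop. The loop here uses separate names for the bounds and the loop variables. The loop in the question reuses row and col for both, which shrinks the inner range on every pass and leaves some cells at zero, and that is why its printed result differs from the output above.

```python
import numpy as np

A = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [1, 1, 1, 1]])
B = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

first = B[:, [0]]                            # shape (3, 1), one divisor per row
rest = B[:, 1:].sum(axis=1, keepdims=True)   # shape (3, 1), B[:,1] + B[:,2]
C = A * (A / first + rest)

# Reference result from explicit loops with distinct variable names.
expected = np.zeros(A.shape)
n_rows, n_cols = A.shape
for r in range(n_rows):
    for c in range(n_cols):
        expected[r, c] = A[r, c] * (A[r, c] / B[r, 0] + B[r, 1] + B[r, 2])

print(first.shape, rest.shape)
print(np.allclose(C, expected))
```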
I have an empty matrix M.shape:
(179, 179)
Now I want to populate it using the following loop:
for game in range(len(games)-1):
    df_round = df_games_position[df_games_position['rodada_id'] == games['rodada_id'][game]]
    players_home = df_round[df_round['time_id'] == games['time_id'][game]]
    players_away = df_round[df_round['time_id'] == games['adversario_id'][game]]
    count=0
    for j_home in range(len(players_home)):
        count_fora=0
        for j_away in range(len(players_away)):
            score_home = 0
            score_away = 0
            points_j_home = players_home['points_num'].iloc[j_home]
            points_j_away = players_away['points_num'].iloc[j_away]
            print ('POINTS HOME',points_j_home)
            print ('POINTS AWAY',points_j_away)
            soma = points_j_home + points_j_away
            if soma != 0:
                score_home = points_j_home / soma
                score_away = points_j_away / soma
            print ('SCORE HOME', score_home)
            print ('SCORE AWAY',score_away)
            j1 = players_home['Rank'].iloc[j_home].astype('int64')
            j2 = players_away['Rank'].iloc[j_away].astype('int64')
            print ('j1',j1)
            print ('j2',j2)
            M[j1,j1] = M[j1,j1] + games['goals_home_norm'][game] + score_home
            M[j1,j2] = M[j1,j2] + games['goals_away_norm'][game] + score_away
            M[j2,j1] = M[j2,j1] + games['goals_home_norm'][game] + score_home
            M[j2,j2] = M[j2,j2] + games['goals_away_norm'][game] + score_away
            print (M)
            count+=1
            print ('COUNT', count)
Finally I get the error:
M[j1,j2] = M[j1,j2] + games['gols_fora_norm'][game] + score_home
IndexError: index 179 is out of bounds for axis 1 with size 179
My last iteration round of prints:
COUNT 3
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 162
[[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
...
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 8.57263145]]
COUNT 4
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 179
What am I missing?
M is a numpy array, and its indices start at 0, not 1, so the valid indices along an axis of size 179 are 0 to 178:
np.array([1,2,3,4]).shape
Out[29]: (4,)
np.array([1,2,3,4])[3]
Out[30]: 4
A simple fix is to create the empty M with shape (180,180), fill it as before, and drop the unused row and column 0 afterwards:
M = M[1:,1:]
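The sketch below reproduces the error and compares two ways around it. It assumes that Rank runs from 1 to 179, which the printed j2 value of 179 suggests but the question does not confirm; the `ranks` array is made up for illustration.

```python
import numpy as np

n_players = 179
# Hypothetical 1-based ranks, modelled on the printed j1/j2 values.
ranks = np.array([7, 162, 179])

M = np.zeros((n_players, n_players))
try:
    M[7, 179] += 1.0  # index 179 does not exist on an axis of size 179
except IndexError as exc:
    print("IndexError:", exc)

# Option A: shift the ranks to 0-based indices before indexing.
idx = ranks - 1
M[idx[0], idx[2]] += 1.0

# Option B: allocate one extra row and column, index with the ranks
# directly, then drop row and column 0.
M_padded = np.zeros((n_players + 1, n_players + 1))
M_padded[ranks[0], ranks[2]] += 1.0
M_trimmed = M_padded[1:, 1:]

print(M_trimmed.shape)
print(np.array_equal(M, M_trimmed))
```

Both options put the value in the same cell; option A avoids the extra allocation.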
I have a NumPy ndarray that looks like:
[[ 0 0 0 1 0]
[ 0 0 0 0 1]]
but I would like to process it to the following form:
[[ 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 1.]]
How would I achieve this?
It looks to me like you have an array of some integer type. You probably want to convert to an array of float:
array_float = array_int.astype(float)
e.g.:
>>> ones_i = np.ones(10, dtype=int)
>>> print ones_i
[1 1 1 1 1 1 1 1 1 1]
>>> ones_f = ones_i.astype(float)
>>> print ones_f
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
With that said, I think that it is worth asking why you want to process the string representation of your array. There very well might be a better way to accomplish your goal.