Python Non Recursive Depth First Search - python

I have written some code to identify the connected components in a binary image. I have used recursive depth first search. However, for some images, the Python Recursion Limit is not enough. Even though I increase the limit to the maximum supported limit on my computer, the program still fails for some images. How can I iteratively implement DFS? Or is there any other better solution?
My code:
import numpy as np

count=1
height = 4
width = 5
g = np.zeros((height+2,width+2))
w = np.zeros((height+2,width+2))
dx = [-1,0,1,1,1,0,-1,-1]
dy = [1,1,1,0,-1,-1,-1,0]

def dfs(x,y,c):
    global w
    w[x][y]=c
    for i in range(8):
        nx = x+dx[i]
        ny = y+dy[i]
        if g[nx][ny] and not w[nx][ny]:
            dfs(nx,ny,c)

def find_connected_components(image):
    global count,g
    g[1:-1,1:-1]=image
    for i in range(1,height+1):
        for j in range(1,width+1):
            if g[i][j] and not w[i][j]:
                dfs(i,j,count)
                count+=1

mask1 = np.array([[0,0,0,0,1],[0,1,1,0,1],[0,0,1,0,0],[1,0,0,0,1]])
find_connected_components(mask1)
print(mask1)
print(w[1:-1,1:-1])
Input and Output:
[[0 0 0 0 1]
[0 1 1 0 1]
[0 0 1 0 0]
[1 0 0 0 1]]
[[ 0. 0. 0. 0. 1.]
[ 0. 2. 2. 0. 1.]
[ 0. 0. 2. 0. 0.]
[ 3. 0. 0. 0. 4.]]

Keep a list of locations that still have to be visited and use it as a stack.
Loop while the list is non-empty: pop one location off the list, label it, and push its unvisited neighbours.
Like so:
def dfs(x,y,c):
    global w
    locs = [(x,y,c)]
    while locs:
        x,y,c = locs.pop()
        w[x][y]=c
        for i in range(8):
            nx = x+dx[i]
            ny = y+dy[i]
            if g[nx][ny] and not w[nx][ny]:
                locs.append((nx, ny, c))
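For reference, here is a self-contained sketch of the same idea that does not rely on globals. The function name `label_components` and the `NEIGHBOURS` table are made up for this illustration. Unlike the snippet above, it labels a cell when it is pushed rather than when it is popped, so no cell can land on the stack twice.

```python
import numpy as np

# The 8 neighbour offsets, equivalent to the dx/dy lists in the question.
NEIGHBOURS = [(-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0)]

def label_components(image):
    """Label 8-connected components of a binary image using an explicit stack."""
    height, width = image.shape
    # Pad with a zero border so neighbour lookups never leave the array.
    g = np.zeros((height + 2, width + 2), dtype=bool)
    g[1:-1, 1:-1] = image.astype(bool)
    labels = np.zeros((height + 2, width + 2), dtype=int)
    count = 0
    for i in range(1, height + 1):
        for j in range(1, width + 1):
            if g[i, j] and not labels[i, j]:
                count += 1
                labels[i, j] = count
                stack = [(i, j)]
                while stack:
                    x, y = stack.pop()
                    for dx, dy in NEIGHBOURS:
                        nx, ny = x + dx, y + dy
                        if g[nx, ny] and not labels[nx, ny]:
                            labels[nx, ny] = count  # mark on push
                            stack.append((nx, ny))
    return labels[1:-1, 1:-1], count

mask1 = np.array([[0,0,0,0,1],[0,1,1,0,1],[0,0,1,0,0],[1,0,0,0,1]])
labels, n = label_components(mask1)
print(labels)
print(n)
```

Because the stack is an ordinary Python list, its depth is limited by memory rather than by the recursion limit. If SciPy is available, `scipy.ndimage.label` with a 3x3 structuring element of ones may be a faster ready-made alternative for 8-connectivity.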

Related

Referencing index for vectorization in NumPy

I have a couple of for loops that I want to vectorize in order to improve performance. They operate on 1 x N matrices.
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))
for i in range(len(array[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
The operations are reliant on the index of the array which is given by the for loop. Is there any way to access the index while using numpy.vectorize so that I can rewrite these as vectorized functions?
First loop:
import numpy as np
array = np.zeros((1, 10))
otherArray = np.arange(30).reshape(1, -1)
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')
for y in range(1, len(array[0]) + 1):
    array[0, y - 1] = np.floor(np.nanmean(otherArray[0, ((y-1)*3):((y-1)*3+3)]))
print(f'array = \n{array}')
array = np.floor(np.nanmean(otherArray.reshape(-1, 3), axis = 1)).reshape(1, -1)
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
array =
[[ 1. 4. 7. 10. 13. 16. 19. 22. 25. 28.]]
Second loop:
array = np.zeros((1, 10))
otherArray = np.arange(10, dtype = float).reshape(1, -1)
L = 1
print(f'array = \n{array}')
print(f'otherArray = \n{otherArray}')
for i in range(len(otherArray[0])):
    array[0, int((i-1)*L+1)] = otherArray[0, i]
print(f'array = \n{array}')
array = otherArray
print(f'array = \n{array}')
output:
array =
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
otherArray =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
array =
[[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]]
It looks like the first loop computes the average of consecutive, non-overlapping windows of three values (a block average rather than a sliding one). This is best done like this:
import numpy as np
window_width = 3
arr = np.arange(12)
out = np.floor(np.nanmean(arr.reshape(-1, window_width), axis=-1))
print(out)
Regarding your second loop, it is not clear to me what it is meant to do. Are you trying to copy values from otherArray to array with some offset? I'd recommend looking at NumPy's slicing and indexing functionality.
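If the second loop is meant to scatter the values of otherArray into array at the positions `(i-1)*L+1`, one possible sketch is to build all target indices at once with `np.arange` and assign through that index array. The value `L = 2` and the array sizes below are assumptions chosen for illustration. Note that for `i = 0` the index is `1 - L`, which is negative when `L > 1` and therefore wraps around to the end of the row, exactly as it does in the loop.

```python
import numpy as np

L = 2  # stride; hypothetical value for illustration
other_array = np.arange(5, dtype=float).reshape(1, -1)
n = other_array.shape[1]

# Loop version, as in the question.
looped = np.zeros((1, 10))
for i in range(n):
    looped[0, int((i - 1) * L + 1)] = other_array[0, i]

# Vectorized version: compute every target index in one step.
idx = (np.arange(n) - 1) * L + 1
vectorized = np.zeros((1, 10))
vectorized[0, idx] = other_array[0]

print(idx)
print(np.array_equal(looped, vectorized))
```

This relies on the target indices being distinct; with repeated indices, the result of an indexed assignment depends on which write lands last.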

Is it possible to find similarities between rows in a matrix without loop?

I have a 2D numpy array. I'm trying to compute the similarities between rows and put them into a similarities array. Is this possible without a loop? Thanks for your time!
# ratings.shape = (943, 1682)
arri = np.zeros(943)
arri = np.where(arri == 0)[0]
arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]
similarities = np.zeros((ratings.shape[0], ratings.shape[0]))
similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])
I want to build a 2D array similarities in which similarities[i, j] is the difference between row i and row j of ratings.
[ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)]
The problem is how numpy pairs up the indices when a two-dimensional array is indexed with two arrays.
First some setup:
import numpy
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
ratings: [1 2 3 4 5]
indicesX: [[0][1][2][3][4]]
indicesY: [[0][1][2][3][4]]
Now let's see what your program produces:
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
As you can see, numpy iterates over similarities basically like the following:
for i in range(5):
    similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])
similarities:
[[0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0.]
[0. 0. 2. 0. 0.]
[0. 0. 0. 3. 0.]
[0. 0. 0. 0. 4.]]
Now instead we need indices like the following to iterate through the entire array:
indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]
We build them as follows:
# Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
indicesX: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
indicesY: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]
Perfect! Now just call similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY]) again and we see:
similarities:
[[0. 1. 2. 3. 4.]
[1. 0. 1. 2. 3.]
[2. 1. 0. 1. 2.]
[3. 2. 1. 0. 1.]
[4. 3. 2. 1. 0.]]
Here the whole code again:
import numpy
ratings = numpy.arange(1, 6)
indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]
similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])
indicesY = numpy.repeat(indicesY, ratings.shape[0])
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
PS
You commented on your own post to improve it. You should edit your question instead of commenting on it, when you want to improve it.
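As a side note, the same matrix can probably be obtained without building the index arrays at all, by letting broadcasting pair every element with every other one. The sketch below uses the 1-D `ratings` from the answer. The 2-D part is an assumption about what "difference between rows" should mean (here: the sum of absolute element-wise differences), and `ratings_2d` is a made-up example.

```python
import numpy as np

ratings = np.arange(1, 6)

# (5, 1) minus (1, 5) broadcasts to (5, 5):
# entry [i, j] is |ratings[i] - ratings[j]|.
similarities = np.abs(ratings[:, None] - ratings[None, :])
print(similarities)

# For 2-D ratings (one row per user), a row-to-row distance has to be chosen.
# Summed absolute difference is used here purely as an illustration.
ratings_2d = np.array([[1, 2, 3], [2, 2, 2], [5, 0, 1]])
row_dist = np.abs(ratings_2d[:, None, :] - ratings_2d[None, :, :]).sum(axis=-1)
print(row_dist)
```

For an array of shape (943, 1682) the intermediate three-dimensional array holds roughly 1.5 billion elements, so processing the rows in chunks may be necessary in practice.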

Combining two numpy arrays with equations based on both arrays

I have a python numpy 3x4 array A:
A=np.array([[0,1,2,3],[4,5,6,7],[1,1,1,1]])
and a 3x3 array B:
B=np.array([[1,1, 1],[2, 2, 2],[3,3,3]])
I am trying to use a numpy operation to produce array C where each element in C is based on an equation using corresponding elements in A and the entire row in B. A simplified example:
C[row,col] = A[row,col] * ( A[row,col] / B[row,0] + B[row,1] + B[row,2] )
My first thought was to keep it simple and just multiply all of A by a column of B. That raised an error.
C = A * B[:,0]
Then I thought to try this but it didn't work.
C = A[:,:] * B[:,0]
I am not sure how to use the " : " operator and get access to the specific row, col at the same time. I can do this in regular loops but I wanted something more numpy.
import numpy as np
A=np.array([[0,1,2,3],[4,5,6,7],[1,1,1,1]])
B=np.array([[1,1, 1],[2, 2, 2],[3,3,3]])
C=np.zeros([3,4])
row,col = A.shape
print(A.shape)
print(A)
print(B.shape)
print(B)
print(C.shape)
print(C)
print(range(row-1))
# Use separate loop variables so that row and col are not overwritten
# while they are still needed as loop bounds.
for r in range(row):
    for c in range(col):
        C[r,c] = A[r,c] * (( A[r,c] / B[r,0]) + B[r,1] + B[r,2])
print(C)
Which prints:
(3, 4)
[[0 1 2 3]
[4 5 6 7]
[1 1 1 1]]
(3, 3)
[[1 1 1]
[2 2 2]
[3 3 3]]
(3, 4)
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
range(0, 2)
[[ 0. 3. 8. 15. ]
[24. 32.5 42. 52.5 ]
[ 6.33333333 6.33333333 6.33333333 6.33333333]]
Suggestions on a better way?
Edited:
Now that I understand broadcasting a bit more, and got that code running, let me expand in a generic way what I am trying to solve. I am trying to map values of a category such as "Air" which can be a range (such as 0-5) that have to be mapped to a shade of a given RGB value. The values are recorded over a time period.
For example, at time 1, the value of Water is 4. The standard RGB color for Water is Blue (0,0,255). There are 5 possible values for Water. In the case of Blue, 255 / 5 = 51. To get the effect of the 4 value on the Blue palette, multiply 51 x 4 = 204. Since we want higher values to be darker, we subtract 255 (white) - 204, yielding 51. The Red and Green components end up being 0. So the value read at time N is a multiplier on the weighted R, G and B values. We invert 0 values to be subtracted from 255 so they appear white. Stronger values are darker.
So to calculate the R' G' and B' for time 1 I used:
answer = data[:,1:4] - (data[:,1:4] / data[:,[0]] * data[:,[4]])
I can extract an [R, G, B] from answer and put it into an Image at some x,y. That works well. But I can't figure out how to use Range, R, G and B to calculate new R', G', B' for all of Time 1, 2, ... N. I would like to extend the numpy approach if possible. I did it with standard loops as:
for row in range(rows):
    for col in range(cols):
        r = int(data[row,1] - (data[row,1] / data[row,0] * data[row,col_offset+col] ))
        g = int(data[row,2] - (data[row,2] / data[row,0] * data[row,col_offset+col] ))
        b = int(data[row,3] - (data[row,3] / data[row,0] * data[row,col_offset+col] ))
        almostImage[row,col] = [r,g,b]
I can display the image in matplotlib and save it to .png, etc. So I think next step is to try list comprehension over the time points 2D array, and then refer back to the range and RGB values. Will give it a try.
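The nested loop above can probably be replaced by broadcasting over a third axis. The sketch below assumes the column layout implied by the loop (column 0 is the range, columns 1 to 3 are R, G, B, and the time values start at `col_offset`). The sample `data` rows are invented for illustration.

```python
import numpy as np

# Assumed layout, mirroring the loop in the question:
# column 0 = range, columns 1..3 = R, G, B, columns 4.. = values at time 1..N.
col_offset = 4
data = np.array([
    [5.0,   0.0, 0.0, 255.0, 4.0, 0.0, 5.0],  # e.g. "Water", blue
    [5.0, 255.0, 0.0,   0.0, 1.0, 2.0, 3.0],  # a made-up red category
])

rng = data[:, 0]               # shape (rows,)
rgb = data[:, 1:4]             # shape (rows, 3)
values = data[:, col_offset:]  # shape (rows, cols)

# Colour change per unit of value, one (R, G, B) triple per row.
step = rgb / rng[:, None]

# (rows, 1, 3) combined with (rows, cols, 1) broadcasts to (rows, cols, 3).
image = (rgb[:, None, :] - step[:, None, :] * values[:, :, None]).astype(int)

print(image.shape)
print(image[0, 0])
```

Here `astype(int)` truncates toward zero, which is the same behaviour as the `int(...)` calls in the loop.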
Try this:
A*(A / B[:,[0]] + B[:,1:].sum(1, keepdims=True))
Output:
array([[ 0. , 3. , 8. , 15. ],
[24. , 32.5 , 42. , 52.5 ],
[ 6.33333333, 6.33333333, 6.33333333, 6.33333333]])
Explanation:
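The first operation A/B[:,[0]] utilizes numpy broadcasting: B[:,[0]] has shape (3, 1), so each row of A is divided by the matching entry of B's first column.
Then B[:,1:].sum(1, keepdims=True) is just B[:,1] + B[:,2], and keepdims=True keeps the result as a (3, 1) column so that it broadcasts the same way. Print it to see details.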
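To make the shapes concrete, here is a small check, using the A and B from the question, that compares the broadcast expression with an explicit element-wise loop. The names `C_loop`, `r` and `c` are introduced only for this illustration.

```python
import numpy as np

A = np.array([[0, 1, 2, 3], [4, 5, 6, 7], [1, 1, 1, 1]])
B = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

print(B[:, 0].shape)    # (3,): does not line up with the rows of a (3, 4) array
print(B[:, [0]].shape)  # (3, 1): broadcasts across the 4 columns of A

C = A * (A / B[:, [0]] + B[:, 1:].sum(1, keepdims=True))

# Element-wise loop for comparison.
C_loop = np.zeros(A.shape)
for r in range(A.shape[0]):
    for c in range(A.shape[1]):
        C_loop[r, c] = A[r, c] * (A[r, c] / B[r, 0] + B[r, 1] + B[r, 2])

print(np.allclose(C, C_loop))
```

This also suggests why `A * B[:,0]` fails: a shape (3,) array is aligned with the last axis of A, which has length 4.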

IndexError: index n is out of bounds for axis 1 with size n

I have an empty matrix M.shape:
(179, 179)
Now I want to populate it using the following loop:
for game in range(len(games)-1):
    df_round = df_games_position[df_games_position['rodada_id'] == games['rodada_id'][game]]
    players_home = df_round[df_round['time_id'] == games['time_id'][game]]
    players_away = df_round[df_round['time_id'] == games['adversario_id'][game]]
    count=0
    for j_home in range(len(players_home)):
        count_fora=0
        for j_away in range(len(players_away)):
            score_home = 0
            score_away = 0
            points_j_home = players_home['points_num'].iloc[j_home]
            points_j_away = players_away['points_num'].iloc[j_away]
            print ('POINTS HOME',points_j_home)
            print ('POINTS AWAY',points_j_away)
            soma = points_j_home + points_j_away
            if soma != 0:
                score_home = points_j_home / soma
                score_away = points_j_away / soma
            print ('SCORE HOME', score_home)
            print ('SCORE AWAY',score_away)
            j1 = players_home['Rank'].iloc[j_home].astype('int64')
            j2 = players_away['Rank'].iloc[j_away].astype('int64')
            print ('j1',j1)
            print ('j2',j2)
            M[j1,j1] = M[j1,j1] + games['goals_home_norm'][game] + score_home
            M[j1,j2] = M[j1,j2] + games['goals_away_norm'][game] + score_away
            M[j2,j1] = M[j2,j1] + games['goals_home_norm'][game] + score_home
            M[j2,j2] = M[j2,j2] + games['goals_away_norm'][game] + score_away
            print (M)
            count+=1
            print ('COUNT', count)
Finally I get the error:
M[j1,j2] = M[j1,j2] + games['gols_fora_norm'][game] + score_home
IndexError: index 179 is out of bounds for axis 1 with size 179
My last iteration round of prints:
COUNT 3
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 162
[[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
...
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 0. ]
[0. 0. 0. ... 0. 0. 8.57263145]]
COUNT 4
SCORE HOME 0.0
SCORE AWAY 0.0
j1 7
j2 179
What am I missing?
M is a numpy array, and numpy indices start at 0, not 1, so an axis of size 179 only accepts indices 0 to 178:
np.array([1,2,3,4]).shape
Out[29]: (4,)
np.array([1,2,3,4])[3]
Out[30]: 4
A simple fix is to create the empty M with shape (180,180) so that index 179 is valid, fill it, and then drop the unused first row and column:
M = M[1:,1:]
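An alternative sketch, assuming the Rank values run from 1 to 179 inclusive (the question does not state this explicitly), is to keep M at (179, 179) and shift each rank to a 0-based position before indexing. The values of `j1` and `j2` below are taken from the last printed iteration; `r1` and `r2` are names made up here.

```python
import numpy as np

n_players = 179
M = np.zeros((n_players, n_players))

# Ranks from the failing iteration, assumed to be 1-based.
j1, j2 = 7, 179

# Shift to 0-based positions before indexing.
r1, r2 = j1 - 1, j2 - 1
M[r1, r2] += 1.0

print(M.shape)
print(M[6, 178])
```

If a rank of 0 can also occur, this shift would silently wrap around to the last row or column, so it is worth checking the minimum and maximum of the Rank column first.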

How to add a dot in python numpy ndarray - data type issue

I have a NumPy ndarray that looks like:
[[ 0 0 0 1 0]
[ 0 0 0 0 1]]
but I would like to process it to the following form:
[[ 0. 0. 0. 1. 0.]
[ 0. 0. 0. 0. 1.]]
How would I achieve this?
It looks to me like you have an array of some integer type. You probably want to convert to an array of float:
array_float = array_int.astype(float)
e.g.:
>>> ones_i = np.ones(10, dtype=int)
>>> print(ones_i)
[1 1 1 1 1 1 1 1 1 1]
>>> ones_f = ones_i.astype(float)
>>> print(ones_f)
[ 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
With that said, I think that it is worth asking why you want to process the string representation of your array. There very well might be a better way to accomplish your goal.
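Applied to the array from the question, a minimal sketch looks like this. The variable names are the ones used in the answer, and the check on `dtype` shows that only the element type changes, not the values.

```python
import numpy as np

array_int = np.array([[0, 0, 0, 1, 0], [0, 0, 0, 0, 1]])
array_float = array_int.astype(float)

print(array_int.dtype.kind)  # 'i' for a signed integer type
print(array_float.dtype)     # float64
print(array_float)
```

`astype` returns a new array, so `array_int` itself is left unchanged.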
