This is my first Question here, let me know if I could've done anything better.
I'm trying to do an element-wise operation between two arrays, but the broadcasting won't work the way I want it to.
I have an array of shape (N,4).
square_list = np.array([[1,2,255,255], [255,255,4,4], [255,255,8,8], [255,255,16,16], [255,255,8,4], [255,1,8,8], [1,255,8,8]], dtype='B')
I also have an array of shape (4,).
square = np.array([1, 8, 8, 1], dtype='B')
What I am able to do is compare my square against each element in the square_list and it is being broadcast into shape (N,4) as expected.
Now I want to compare my square in each possible rotation against the square_list. I've written a function which returns an array of shape (4,4), which contains each possible rotation.
square.rotations
array([[1, 8, 8, 1],
[1, 1, 8, 8],
[8, 1, 1, 8],
[8, 8, 1, 1]], dtype=uint8)
I know how to do this using a loop. I'd prefer however to use an element-wise operator that returns my desired shape.
What I get:
rotations & square_list
ValueError: operands could not be broadcast together with shapes (4,4) (7,4)
What I'd like to get:
rotations & square_list
array([[[1, 0, 8, 1],
[1, 8, 0, 0],
[1, 8, 8, 0],
[1, 8, 0, 0],
[1, 8, 8, 0],
[1, 0, 8, 0],
[1, 8, 8, 0]],
[[1, 0, 8, 8],
[1, 1, 0, 0],
[1, 1, 8, 8],
[1, 1, 0, 0],
[1, 1, 8, 0],
[1, 1, 8, 8],
[1, 1, 8, 8]],
[[0, 0, 1, 8],
[8, 1, 0, 0],
[8, 1, 0, 8],
[8, 1, 0, 0],
[8, 1, 0, 0],
[8, 1, 0, 8],
[0, 1, 0, 8]],
[[0, 0, 1, 1],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 0, 0, 0],
[0, 8, 0, 0]]], dtype=uint8)
This is just to visualize what I want; I don't particularly care about the order of the axes. A shape of either (4, N, 4) or (N, 4, 4) would be great.
I have the feeling that this can be achieved easily by just reshaping one of the input arrays but I couldn't figure it out.
Thanks in advance!
Add an extra dimension to rotations:
square_list & rotations[:,None]
output:
array([[[1, 0, 8, 1],
[1, 8, 0, 0],
[1, 8, 8, 0],
[1, 8, 0, 0],
[1, 8, 8, 0],
[1, 0, 8, 0],
[1, 8, 8, 0]],
[[1, 0, 8, 8],
[1, 1, 0, 0],
[1, 1, 8, 8],
[1, 1, 0, 0],
[1, 1, 8, 0],
[1, 1, 8, 8],
[1, 1, 8, 8]],
[[0, 0, 1, 8],
[8, 1, 0, 0],
[8, 1, 0, 8],
[8, 1, 0, 0],
[8, 1, 0, 0],
[8, 1, 0, 8],
[0, 1, 0, 8]],
[[0, 0, 1, 1],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 8, 0, 0],
[8, 0, 0, 0],
[0, 8, 0, 0]]], dtype=uint8)
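To see why the extra axis helps: NumPy aligns shapes from the right, so (7,4) against (4,1,4) broadcasts to (4,7,4). Here is a minimal sketch of both axis orders. The np.roll-based `rotations` is only an assumed stand-in for the asker's own rotation helper, which isn't shown.

```python
import numpy as np

square_list = np.array([[1, 2, 255, 255], [255, 255, 4, 4], [255, 255, 8, 8],
                        [255, 255, 16, 16], [255, 255, 8, 4], [255, 1, 8, 8],
                        [1, 255, 8, 8]], dtype='B')
square = np.array([1, 8, 8, 1], dtype='B')

# Assumed stand-in for the asker's helper: all four cyclic shifts of `square`
rotations = np.stack([np.roll(square, k) for k in range(4)])

# (7, 4) & (4, 1, 4): every rotation against every row, rotations first
by_rotation = square_list & rotations[:, None]
# (7, 1, 4) & (4, 4): same values, rows of square_list first
by_square = square_list[:, None] & rotations

print(by_rotation.shape)  # (4, 7, 4)
print(by_square.shape)    # (7, 4, 4)
```

The two results hold the same numbers with the first two axes swapped, so which one to use depends only on how you want to index the result afterwards.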
I have the following matrix of 0s and 1s, and I want to identify its spots (groups of elements with value 1 that are connected to each other).
M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
[1,1,1,0,0,0,0,0,0,1,1],
[1,1,1,0,0,0,0,0,0,1,1],
[1,1,1,0,0,1,1,1,0,0,0],
[0,0,0,0,0,1,1,1,0,0,0],
[1,1,1,0,1,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0],
[1,1,1,0,0,1,1,1,0,0,0]])
In the matrix there are four spots.
For example, my output for the first spot should look like this:
spot_0 = array[(0,0),(0,1), (0,2), (1,0),(1,1), (1,2), (2,0),(2,1), (2,2), (3,0),(3,1), (3,2)]
Nbr_0 = 12
Top_Left = (0, 0)
The same goes for the other three spots.
Does anyone know how I can identify each spot, along with its number of elements and its top-left element, using NumPy functions?
Thanks
You can use connected-component labeling to find the spots. Then you can use np.max to find the number of components and np.argwhere to find the locations of each component. Here is an example:
# OpenCV provides a similar function
from skimage.measure import label
components = label(M)
# array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])
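for i in range(1, np.max(components)+1):
    spot_i = np.argwhere(components == i)  # (row, col) pairs of component i, in row-major order
    Nbr_i = len(spot_i)                    # number of elements in the component
    Top_Left_i = spot_i[0]                 # first element in row-major order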
Note that Top_Left only makes sense for a rectangular area. If a spot is not rectangular, this point needs to be carefully defined.
Note also that this method is only efficient with few components, since every pass of the loop scans the whole array. If there are many components, it is better to replace the loop with a single iteration over the components array: the output is stored in a list l, and l[components[i,j]] is updated with the information found at each location (i,j) of components. This last algorithm will be slow in pure Python unless Numba or Cython is used to speed it up.
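The single-pass idea from the last note could look roughly like the sketch below. It is plain Python working on the labeled grid shown above (copied in as nested lists), the `summarize_spots` name and the dict layout are made up for illustration, and "top-left" is taken to mean the corner of the bounding box, which is only one possible definition.

```python
# Labeled grid as returned by label(M) above, copied in as nested lists
labels = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
]

def summarize_spots(labels):
    """Visit every cell once and accumulate per-label information."""
    info = {}
    for i, row in enumerate(labels):
        for j, lab in enumerate(row):
            if lab == 0:  # background
                continue
            entry = info.setdefault(lab, {"cells": [], "top": i, "left": j})
            entry["cells"].append((i, j))
            entry["top"] = min(entry["top"], i)    # smallest row seen so far
            entry["left"] = min(entry["left"], j)  # smallest column seen so far
    return info

info = summarize_spots(labels)
for lab, entry in sorted(info.items()):
    print(lab, len(entry["cells"]), (entry["top"], entry["left"]))
```

For spot 3 this reports (3, 4) as the top-left corner, although that cell itself is 0, which is exactly the ambiguity mentioned above for non-rectangular spots.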
You could use skimage.measure.label or other tools (for instance, OpenCV or igraph) to create labels for connected components:
# from Jérôme's answer
from skimage.measure import label
components = label(M)
# array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
# [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
# [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
# [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])
Next, you could create a one-dimensional view of the image, sort the pixel values, and find the dividing points between the sorted label values:
components_ravel = components.ravel()
c = np.arange(1, np.max(components_ravel) + 1)
argidx = np.argsort(components_ravel)
div_points = np.searchsorted(components_ravel, c, sorter=argidx)
# Sorted label values are:
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
# 2, 2, 2, 2
# 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3
# 4, 4, 4, 4, 4, 4, 4, 4, 4
# So the indices that divide these groups are:
# [47, 59, 63, 79]
After that, you could split the array of sorting indices at these points and convert each piece back into two-dimensional coordinates:
spots = []
for n in np.split(argidx, div_points)[1:]:  # the first piece holds the background zeros (empty if there are none), so skip it
    x, y = np.unravel_index(n, components.shape)
    spots.append(np.transpose([x, y]))
It creates a list of spot coordinates of each group:
[array([[1, 0], [1, 2], [0, 2], [0, 1], [1, 1], [0, 0], [2, 2], [2, 1], [2, 0], [3, 2], [3, 1], [3, 0]]),
array([[2, 10], [1, 9], [2, 9], [1, 10]]),
array([[6, 5], [7, 5], [7, 6], [7, 7], [6, 7], [6, 6], [3, 5], [4, 6], [3, 6], [4, 5], [3, 7], [5, 7], [5, 6], [4, 7], [5, 5], [5, 4]]),
array([[5, 0], [5, 1], [5, 2], [6, 2], [7, 0], [6, 0], [6, 1], [7, 1], [7, 2]])]
Note that the order of the pixels within each group is mixed up. This is because the default sort used by np.argsort is not stable. You could fix it like so:
argidx = np.argsort(components_ravel, kind='stable')
In this case you'll get:
[array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2], [3, 0], [3, 1], [3, 2]]),
array([[1, 9], [1, 10], [2, 9], [2, 10]]),
array([[3, 5], [3, 6], [3, 7], [4, 5], [4, 6], [4, 7], [5, 4], [5, 5], [5, 6], [5, 7], [6, 5], [6, 6], [6, 7], [7, 5], [7, 6], [7, 7]]),
array([[5, 0], [5, 1], [5, 2], [6, 0], [6, 1], [6, 2], [7, 0], [7, 1], [7, 2]])]
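If only the size and the first pixel of each spot are needed, the same sorted indices give both without a Python loop. This is a small sketch under the assumption that the labels are 1..K with 0 as background; the `starts`, `sizes`, and `first` names are mine.

```python
import numpy as np

# Labeled grid from above
components = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
])

flat = components.ravel()
order = np.argsort(flat, kind='stable')  # stable: row-major order kept within a label
wanted = np.arange(1, flat.max() + 1)
starts = np.searchsorted(flat, wanted, sorter=order)  # where each label begins

# Group sizes are the gaps between consecutive start positions
sizes = np.diff(np.append(starts, flat.size))
# With a stable sort, the first index of each group is its first pixel in row-major order
first = np.transpose(np.unravel_index(order[starts], components.shape))

print(sizes)  # [12  4 16  9]
print(first)
```

Here `first` is the first pixel of each spot in reading order, which for the third spot is (3, 5) rather than the bounding-box corner.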
num1 = [1,2,3,4,5]
num2 = [1,2,3,4,5]
arr1 = [[0]*(len(num2)+1)]*(len(num1)+1)
arr2 = [[0 for _ in range(len(num2)+1)] for _ in range(len(num1)+1)]
I get different results depending on whether I use arr1 or arr2.
Don't arr1 and arr2 create the same 2D array?
They are not the same. arr1 is a list with (len(num1)+1) references to the same list [0]*(len(num2)+1). So when you modify an element through one of them, the change is visible through all the other references as well.
For example,
>>> arr1[0][0] += 1
>>> print(arr1)
[[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0]]
arr2 doesn't suffer from this problem because it has len(num1)+1 distinct lists:
>>> arr2[0][0] += 1
>>> print(arr2)
[[1, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]
A better way to see the difference is to use a random number to fill the entries.
from random import randrange
num1 = [1,2,3,4,5]
num2 = [1,2,3,4,5]
arr1 = [[randrange(10)]*(len(num2)+1)]*(len(num1)+1)
arr2 = [[randrange(10) for _ in range(len(num2)+1)] for _ in range(len(num1)+1)]
print(arr1)
print(arr2)
The output is:
[[5, 5, 5, 5, 5, 5], [5, 5, 5, 5, 5, 5], [5, 5, 5, 5, 5, 5], [5, 5, 5, 5, 5, 5], [5, 5, 5, 5, 5, 5], [5, 5, 5, 5, 5, 5]]
[[7, 4, 2, 4, 0, 3], [7, 5, 1, 0, 1, 7], [4, 4, 1, 0, 2, 1], [2, 3, 6, 2, 6, 7], [6, 6, 6, 0, 3, 3], [0, 4, 5, 0, 6, 6]]
You can see that arr1 has the same number in every entry, while the entries of arr2 are drawn independently. This is because randrange(10) is evaluated only once for arr1, which is then built by repeating a one-element list, [5] here.
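The aliasing can also be checked directly with `is`, which compares object identity. A minimal sketch with smaller, made-up dimensions:

```python
rows, cols = 3, 4

aliased = [[0] * cols] * rows                    # one inner list, referenced 3 times
independent = [[0] * cols for _ in range(rows)]  # 3 separate inner lists

print(aliased[0] is aliased[1])          # True
print(independent[0] is independent[1])  # False

aliased[0][0] = 1
independent[0][0] = 1
print(aliased)      # [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(independent)  # [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

Repeating the inner list with `[0] * cols` is harmless because integers are immutable; the problem only appears when the repeated element is itself a mutable object such as a list.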
I am using a matrix for multiple sequence alignment, and this is the score matrix I got by running the alignment algorithm.
My matrix:
[
[0, 24, -5, 3, -3, -5],
[0, -4, 8, 1, 1],
[0, 13, 1, 2],
[0, -2, 5],
[0, 4],
[0]
]
Matrix I want to build:
[
[0, 24, -5, 3, -3, -5],
[24, 0, -4, 8, 1, 1],
[-5, -4, 0, 13, 1, 2],
[3, 8, 13, 0, -2, 5],
[-3, 1, 1, -2, 0, 4],
[-5, 1, 2, 5, 4, 0]
]
I am trying to create a symmetric matrix from this output in Python without using NumPy or any additional library. I have already implemented it with NumPy, but I want a version that does not use it.
Try the following:
upper = [[0, 24, -5, 3, -3, -5], [0, -4, 8, 1, 1], [0, 13, 1, 2], [0, -2, 5], [0, 4], [0]]
n = len(upper) # 6: num of rows and cols (assuming square)
output = []
for i in range(n): # iterate over rows
    # on or above the diagonal: read from upper; below it: mirror an already built row
    row = [(upper[i][j - i] if j >= i else output[j][i]) for j in range(n)]
    output.append(row)
print(output)
# [[0, 24, -5, 3, -3, -5], [24, 0, -4, 8, 1, 1], [-5, -4, 0, 13, 1, 2], [3, 8, 13, 0, -2, 5], [-3, 1, 1, -2, 0, 4], [-5, 1, 2, 5, 4, 0]]
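The same mapping can be written without referring back to earlier rows. Row i of the ragged input starts at the diagonal, so entry (i, j) with j >= i sits at upper[i][j - i], and by symmetry every entry (i, j) is upper[min(i, j)][abs(i - j)]. A minimal sketch (the `symmetrize` name is just for illustration):

```python
def symmetrize(upper):
    """Build the full symmetric matrix from ragged upper-triangular rows."""
    n = len(upper)
    return [[upper[min(i, j)][abs(i - j)] for j in range(n)] for i in range(n)]

upper = [[0, 24, -5, 3, -3, -5], [0, -4, 8, 1, 1], [0, 13, 1, 2], [0, -2, 5], [0, 4], [0]]
full = symmetrize(upper)

for row in full:
    print(row)

# Sanity check: the result equals its own transpose
print(all(full[i][j] == full[j][i] for i in range(6) for j in range(6)))  # True
```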
I wrote a backtracking Sudoku solving algorithm in Python.
It solves a 2D array like this (zero means "empty field"):
[
[7, 0, 0, 0, 0, 9, 0, 0, 3],
[0, 9, 0, 1, 0, 0, 8, 0, 0],
[0, 1, 0, 0, 0, 7, 0, 0, 0],
[0, 3, 0, 4, 0, 0, 0, 8, 0],
[6, 0, 0, 0, 8, 0, 0, 0, 1],
[0, 7, 0, 0, 0, 2, 0, 3, 0],
[0, 0, 0, 5, 0, 0, 0, 1, 0],
[0, 0, 4, 0, 0, 3, 0, 9, 0],
[5, 0, 0, 7, 0, 0, 0, 0, 2],
]
like this:
[
[7, 5, 8, 2, 4, 9, 1, 6, 3],
[4, 9, 3, 1, 5, 6, 8, 2, 7],
[2, 1, 6, 8, 3, 7, 4, 5, 9],
[9, 3, 5, 4, 7, 1, 2, 8, 6],
[6, 4, 2, 3, 8, 5, 9, 7, 1],
[8, 7, 1, 9, 6, 2, 5, 3, 4],
[3, 2, 7, 5, 9, 4, 6, 1, 8],
[1, 8, 4, 6, 2, 3, 7, 9, 5],
[5, 6, 9, 7, 1, 8, 3, 4, 2]
]
But for "hard" Sudokus (where there are a lot of zeros at the beginning), it's quite slow. It takes the algorithm around 9 seconds to solve the Sudoku above. That's a lot better than what I started with (90 seconds), but still slow.
I think that the "deepcopy" can somehow be improved or replaced (because it is executed 103,073 times in the example below), but my basic approaches were slower.
I've heard of 0.01-second C/C++ solutions, but I'm not sure if those are backtracking algorithms or some kind of mathematical solution.
This is my whole algorithm with 2 example Sudokus:
from copy import deepcopy
def is_sol_row(mat,row,val):
    m = len(mat)
    for i in range(m):
        if mat[row][i] == val:
            return False
    return True

def is_sol_col(mat,col,val):
    m = len(mat)
    for i in range(m):
        if mat[i][col] == val:
            return False
    return True

def is_sol_block(mat,row,col,val):
    rainbow = [0,0,0,3,3,3,6,6,6]
    i = rainbow[row]
    j = rainbow[col]
    elements = {
        mat[i + 0][j + 0], mat[i + 1][j + 0], mat[i + 2][j + 0],
        mat[i + 0][j + 1], mat[i + 1][j + 1], mat[i + 2][j + 1],
        mat[i + 0][j + 2], mat[i + 1][j + 2], mat[i + 2][j + 2],
    }
    if val in elements:
        return False
    return True

def is_sol(mat,row,col,val):
    return is_sol_row(mat,row,val) and is_sol_col(mat,col,val) and is_sol_block(mat,row,col,val)

def findAllZeroIndizes(mat):
    m = len(mat)
    indizes = []
    for i in range(m):
        for j in range(m):
            if mat[i][j] == 0:
                indizes.append((i,j))
    return indizes

def sudoku(mat):
    q = [(mat,0)]
    zeroIndizes = findAllZeroIndizes(mat)
    while q:
        t,numSolvedIndizes = q.pop()
        if numSolvedIndizes == len(zeroIndizes):
            return t
        else:
            i,j = zeroIndizes[numSolvedIndizes]
            for k in range(1,10):
                if is_sol(t,i,j,k):
                    newt = deepcopy(t)
                    newt[i][j] = k
                    q.append((newt,numSolvedIndizes+1))
    return False
mat = [
[7, 0, 0, 0, 0, 9, 0, 0, 3],
[0, 9, 0, 1, 0, 0, 8, 0, 0],
[0, 1, 0, 0, 0, 7, 0, 0, 0],
[0, 3, 0, 4, 0, 0, 0, 8, 0],
[6, 0, 0, 0, 8, 0, 0, 0, 1],
[0, 7, 0, 0, 0, 2, 0, 3, 0],
[0, 0, 0, 5, 0, 0, 0, 1, 0],
[0, 0, 4, 0, 0, 3, 0, 9, 0],
[5, 0, 0, 7, 0, 0, 0, 0, 2],
]
# mat = [
# [3, 0, 6, 5, 0, 8, 4, 0, 0],
# [5, 2, 0, 0, 0, 0, 0, 0, 0],
# [0, 8, 7, 0, 0, 0, 0, 3, 1],
# [0, 0, 3, 0, 1, 0, 0, 8, 0],
# [9, 0, 0, 8, 6, 3, 0, 0, 5],
# [0, 5, 0, 0, 9, 0, 6, 0, 0],
# [1, 3, 0, 0, 0, 0, 2, 5, 0],
# [0, 0, 0, 0, 0, 0, 0, 7, 4],
# [0, 0, 5, 2, 0, 6, 3, 0, 0]
# ]
print(sudoku(mat))
The largest time sink is that, for every open position, you try each of the nine digits without learning anything from the failed attempts. Your test grid has 56 open grid locations, so anything you do is magnified through that lens. A little preprocessing will go a long way. For instance, keep track of the numbers still available in each row and column. Key that appropriately, and use it for your search instead of range(1,10).
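One way to read that suggestion is sketched below: keep a set of used digits per row, column, and block, try only the digits missing from all three, and undo a placement on failure instead of deep-copying the grid. This is my own illustration rather than code from the question; `solve` and its bookkeeping names are assumptions, and the demo uses the second (commented-out) puzzle from the question.

```python
ALL_DIGITS = set(range(1, 10))

def solve(grid):
    """Solve the 9x9 grid in place; return True on success, False otherwise."""
    rows = [set() for _ in range(9)]
    cols = [set() for _ in range(9)]
    blocks = [set() for _ in range(9)]
    empty = []
    for r in range(9):
        for c in range(9):
            v = grid[r][c]
            if v:
                rows[r].add(v)
                cols[c].add(v)
                blocks[3 * (r // 3) + c // 3].add(v)
            else:
                empty.append((r, c))

    def backtrack(k):
        if k == len(empty):
            return True
        r, c = empty[k]
        b = 3 * (r // 3) + c // 3
        # Only digits not yet used in this row, column, and block
        for v in ALL_DIGITS - rows[r] - cols[c] - blocks[b]:
            grid[r][c] = v
            rows[r].add(v); cols[c].add(v); blocks[b].add(v)
            if backtrack(k + 1):
                return True
            # Undo the placement instead of copying the grid
            grid[r][c] = 0
            rows[r].discard(v); cols[c].discard(v); blocks[b].discard(v)
        return False

    return backtrack(0)

puzzle = [
    [3, 0, 6, 5, 0, 8, 4, 0, 0],
    [5, 2, 0, 0, 0, 0, 0, 0, 0],
    [0, 8, 7, 0, 0, 0, 0, 3, 1],
    [0, 0, 3, 0, 1, 0, 0, 8, 0],
    [9, 0, 0, 8, 6, 3, 0, 0, 5],
    [0, 5, 0, 0, 9, 0, 6, 0, 0],
    [1, 3, 0, 0, 0, 0, 2, 5, 0],
    [0, 0, 0, 0, 0, 0, 0, 7, 4],
    [0, 0, 5, 2, 0, 6, 3, 0, 0],
]
grid = [row[:] for row in puzzle]  # keep the original puzzle untouched
solved = solve(grid)
print(solved)
for row in grid:
    print(row)
```

The cells are still visited in a fixed order here; choosing the open cell with the fewest remaining candidates first would likely prune further, but that is a separate change.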
Another technique is to apply simple algorithms to make trivial placements as they become available. For instance, you can quickly derive the 1 in the upper-left block, and the missing 7s in the left and middle columns of blocks. This alone cuts the solution time in half. Wherever you're down to a single choice for what number goes in a selected open square, or where a selected number can be placed in a particular row/col/block, then make that placement before you engage in exhaustive backtracking.
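A rough sketch of such a pre-pass follows, covering both cases: a cell with only one possible digit, and a digit with only one possible cell in a row, column, or block. The `candidates`, `UNITS`, and `fill_singles` names are made up for this illustration, and the code favors clarity over speed.

```python
def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3x3 block of cell (r, c)."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used

# All 27 units: 9 rows, 9 columns, 9 blocks, each as a list of (row, col) cells
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(br + i, bc + j) for i in range(3) for j in range(3)]
       for br in (0, 3, 6) for bc in (0, 3, 6)]
)

def fill_singles(grid):
    """Place forced digits in place until none are left; return how many were placed."""
    placed = 0
    progress = True
    while progress:
        progress = False
        # A cell that has exactly one candidate digit
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    cand = candidates(grid, r, c)
                    if len(cand) == 1:
                        grid[r][c] = cand.pop()
                        placed += 1
                        progress = True
        # A digit that fits in exactly one open cell of a unit
        for unit in UNITS:
            for d in range(1, 10):
                spots = [(r, c) for r, c in unit
                         if grid[r][c] == 0 and d in candidates(grid, r, c)]
                if len(spots) == 1:
                    r, c = spots[0]
                    grid[r][c] = d
                    placed += 1
                    progress = True
    return placed

grid = [
    [7, 0, 0, 0, 0, 9, 0, 0, 3],
    [0, 9, 0, 1, 0, 0, 8, 0, 0],
    [0, 1, 0, 0, 0, 7, 0, 0, 0],
    [0, 3, 0, 4, 0, 0, 0, 8, 0],
    [6, 0, 0, 0, 8, 0, 0, 0, 1],
    [0, 7, 0, 0, 0, 2, 0, 3, 0],
    [0, 0, 0, 5, 0, 0, 0, 1, 0],
    [0, 0, 4, 0, 0, 3, 0, 9, 0],
    [5, 0, 0, 7, 0, 0, 0, 0, 2],
]
placed = fill_singles(grid)
print(placed, "forced placements")
print(grid[6][2])  # 7: the only cell in the bottom-left block that can still take a 7
```

Whatever stays 0 after this pass is what the backtracking search still has to handle, on a smaller problem than before.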