Allocating array (list) algorithm permutations in python - python

I'm having trouble solving the following problem.
Given an array size (say size = 20, to keep the question simple),
the array is filled with zeros as follows:
arr = [0]*20 ==> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
We have a couple of constant sample sizes, such as 4, 3 and 2:
SampleA = 4, SampleB = 3, SampleC = 2
I need all the permutations/variations of how the samples can be allocated in the list.
Each sample can be placed at a different index.
For example, SampleA = 4 can occupy indexes 0:3, or 1:4, ... up to 16:19
(as you can see, there are quite a lot of possibilities).
Things get complicated once the list gets more crowded, for example with 3 + 2 + 3 + 4:
[0, x, x, x, 0, 0, x, x, 0, x, x, x, 0, 0, 0, 0, x, x, x, x]
What I basically need is to find all the possible ways to allocate the samples.
I get a dictionary where
key = sample size (number of indexes), and
value = how many times it repeats.
For the example above: {3: 2, 2: 1, 4: 1}
I would like the function to return a list of the indexes that are != 0.
For this example:
[0, x, x, x, 0, 0, x, x, 0, x, x, x, 0, 0, 0, 0, x, x, x, x]
the function will return:
list_ind = [0,5,6,9,13,14,15,16]

So I asked a colleague for help, and we came up with a solution.
The example uses:
4:2, 3:1, 2:1
or, in words:
two samples of size 4, one of size 3, one of size 2.
The code is below.
*If someone can optimize it, that would be great.
size_of_wagon = 20
dct = {4: 2, 3: 1, 2: 1}
# flatten the dictionary into a list of sample sizes: [4, 4, 3, 2]
l = [item for sublist in [[k] * v for (k, v) in dct.items()] for item in sublist]
def gen_func(lstOfSamples, length, shift):
    # no samples left to place: yield one empty allocation
    if not lstOfSamples:
        yield []
        return
    sample = lstOfSamples[0]  # take first sample
    # note: range(length - sample) never places a sample flush against the
    # end of the array; use range(length - sample + 1) to allow that
    for i in range(length - sample):
        for x in gen_func(lstOfSamples[1:], length - (sample + i), shift + sample + i):
            yield [(shift + i, sample)] + x
g = list(gen_func(l, size_of_wagon, 0))
for i in g:
    print(i)
print(len(g))
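The question asks for a list of occupied indexes, while gen_func yields (start, size) pairs. A minimal sketch of the conversion, assuming that pair format; occupied_indexes is a hypothetical helper name, not part of the original code:

```python
def occupied_indexes(placements):
    """Expand (start, size) pairs into a sorted list of occupied indexes."""
    indexes = []
    for start, size in placements:
        indexes.extend(range(start, start + size))
    return sorted(indexes)

# one allocation in the (start, size) format yielded by gen_func
example = [(1, 3), (6, 2), (9, 3), (16, 4)]
print(occupied_indexes(example))
# prints [1, 2, 3, 6, 7, 9, 10, 11, 16, 17, 18, 19]
```

The example placements correspond to the 3 + 2 + 3 + 4 layout shown in the question.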

Related

python function to count nonzero patches in array

For a given array (1 or 2-dimensional) I would like to know, how many "patches" there are of nonzero elements. For example, in the array [0, 0, 1, 1, 0, 1, 0, 0] there are two patches.
I came up with a function for the 1-dimensional case, where I first assume the maximal number of patches and then decrease that number if a neighbor of a nonzero element is nonzero, too.
def count_patches_1D(array):
    patches = np.count_nonzero(array)
    for i in np.nonzero(array)[0][:-1]:
        if (array[i+1] != 0):
            patches -= 1
    return patches
I'm not sure if that method works for two dimensions as well. I haven't come up with a function for that case and I need some help for that.
Edit for clarification:
I would like to count connected patches in the 2-dimensional case, including diagonals. So an array [[1, 0], [1, 1]] would have one patch as well as [[1, 0], [0, 1]].
Also, I am wondering if there is a built-in Python function for this.
The following should work:
import numpy as np
import copy
# create an array
A = np.array(
    [
        [0, 1, 1, 1, 0, 1],
        [0, 0, 1, 0, 0, 0],
        [1, 0, 0, 1, 0, 1],
        [1, 0, 0, 0, 0, 1],
        [0, 0, 1, 0, 0, 1]
    ]
)
def isadjacent(pos, newpos):
    """
    Check whether two coordinates are adjacent
    """
    # check for adjacent columns and rows
    return np.all(np.abs(np.array(newpos) - np.array(pos)) < 2)
def count_patches(A):
    """
    Count the number of non-zero patches in an array.
    """
    # get non-zero coordinates
    coords = np.nonzero(A)
    # add them to a list
    inipatches = list(zip(*coords))
    # list to contain all patches
    allpatches = []
    while len(inipatches) > 0:
        patch = [inipatches.pop(0)]
        i = 0
        # check for all points adjacent to the points within the current patch
        while True:
            curpatch = patch[i]
            remaining = copy.deepcopy(inipatches)
            for j in range(len(remaining)):
                if isadjacent(curpatch, remaining[j]):
                    patch.append(remaining[j])
                    inipatches.remove(remaining[j])
                    if len(inipatches) == 0:
                        break
            i += 1
            if len(inipatches) == 0 or i >= len(patch):
                # no points remaining, or every point in the patch was checked
                # (stopping as soon as one point adds nothing can split a patch)
                break
        allpatches.append(patch)
    return len(allpatches)
print(f"Number of patches is {count_patches(A)}")
Number of patches is 5
This should work for arrays with any number of dimensions.
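For comparison, here is a minimal sketch of the same count done with an iterative flood fill, using only the standard library. It is a swapped-in technique, not the method above: count_patches_2d is a hypothetical name, it only handles 2-dimensional nested lists, and diagonal neighbours count as connected, as the question requires.

```python
def count_patches_2d(grid):
    """Count 8-connected patches of non-zero cells in a 2-D nested list."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or (r, c) in seen:
                continue
            # new patch found: flood-fill everything connected to it
            patches += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] != 0
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            stack.append((nr, nc))
    return patches

A = [
    [0, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
]
print(count_patches_2d(A))  # prints 5
```

Each cell is pushed onto the stack at most once, so the work grows linearly with the number of cells rather than with the square of the number of non-zero points.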

I was trying to use matrixes without libraries but I can't set the values correctly

def create_matrix(xy):
    matrix = []
    matrix_y = []
    x = xy[0]
    y = xy[1]
    for z in range(y):
        matrix_y.append(0)
    for n in range(x):
        matrix.append(matrix_y)
    return matrix
def set_matrix(matrix,xy,set):
    x = xy[0]
    y = xy[1]
    matrix[x][y] = set
    return matrix
index = [4,5]
index_2 = [3,4]
z = create_matrix(index)
z = set_matrix(z,index_2, 12)
print(z)
output:
[[0, 0, 0, 0, 12], [0, 0, 0, 0, 12], [0, 0, 0, 0, 12], [0, 0, 0, 0, 12]]
This code should change only the last array
In your for n in range(x): loop you are appending the same matrix_y list multiple times. Python does not copy that list; it stores a reference to it. So you end up with a list of references that all point to the same inner list.
Move the matrix_y = [] setup inside the n loop and you get a separate list for every row.
Comment: Python does not expose pointers as a language concept, but it does use references under the hood. It does not make it obvious when an operation copies data and when it only copies a reference to that data. That's kind of bad language design, and it tripped you up here. So now you know that references exist, and that most of the time when you "assign arrays" you are actually only setting a reference.
Another comment: if you are going to be doing anything serious with matrices, you should really look into numpy. It will be many times faster for numerical computations.
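The aliasing described above can be reproduced in a few lines. This is a small sketch with made-up variable names (shared, independent) that are not part of the question's code:

```python
# every row is the same list object
shared = [[0] * 3] * 2
shared[0][0] = 9
print(shared)                  # prints [[9, 0, 0], [9, 0, 0]]
print(shared[0] is shared[1])  # prints True

# a comprehension builds a new list for each row
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 9
print(independent)             # prints [[9, 0, 0], [0, 0, 0]]
```

The `is` check confirms that both rows of shared are one object, which is why a single assignment shows up in every row.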
You don't need the first loop in create_matrix; comment it out:
#for z in range(y):
#    matrix_y.append(0)
Change the second loop like this, so that each row is a new list of zeros of length y:
for n in range(x):
    matrix.append([0] * y)
Result (only the last cell of the matrix was changed):
z = set_matrix(z,index_2, 12)
print(z)
# [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 12]]

Backtracking pathfinding problem in Python

Recently, I found out about backtracking and, without much thinking, started on the book from the guy who has shown some Sudoku backtracking tricks (https://www.youtube.com/watch?v=G_UYXzGuqvM&ab_channel=Computerphile). Unfortunately, I'm stuck on the first backtracking problem, which comes without a solution.
The problem is formulated as follows:
Use backtracking to calculate the number of all paths from the bottom left to the top right corner in an x * y-grid. This includes paths like https://imgur.com/3t3Np4M. Note that every point can only be visited once. Write a function np(x,y) that returns the number of paths in an x*y-grid. E.g. np(2,3) should return 38. Hint: Create a grid of booleans where you mark the positions already visited.
Whatever I change in this short block of code, I land nowhere near 38.
```
grid = [[0, 0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0]]
solution = 0
def number_of_paths(x, y):
    global solution
    global grid
    for i in range(0, x):
        for j in range(0, y):
            if grid[i][j] == 0:
                grid[i][j] = 1
                number_of_paths(x, y)
                grid[i][j] = 0
                solution += 1
                return
if __name__ == '__main__':
    number_of_paths(2, 3)
    print(grid)
    print(solution)
```
That's the sample solution from the Sudoku solver that I used as a model:
```
grid = [[5, 3, 0, 0, 7, 0, 0, 0, 0],
        [6, 0, 0, 1, 9, 5, 0, 0, 0],
        [0, 9, 8, 0, 0, 0, 0, 6, 0],
        [8, 0, 0, 0, 6, 0, 0, 0, 3],
        [4, 0, 0, 8, 0, 3, 0, 0, 1],
        [7, 0, 0, 0, 2, 0, 0, 0, 6],
        [0, 6, 0, 0, 0, 0, 2, 8, 0],
        [0, 0, 0, 4, 1, 9, 0, 0, 5],
        [0, 0, 0, 0, 8, 0, 0, 7, 9]]
import numpy as np
def possible(y, x, n):
    global grid
    for i in range(0, 9):
        if grid[y][i] == n:
            return False
    for i in range(0, 9):
        if grid[i][x] == n:
            return False
    x0 = (x // 3) * 3
    y0 = (y // 3) * 3
    for i in range(0, 3):
        for j in range(0, 3):
            if grid[y0 + i][x0 + j] == n:
                return False
    return True
def solve():
    global grid
    for y in range(9):
        for x in range(9):
            if grid[y][x] == 0:
                for n in range(1, 10):
                    if possible(y, x, n):
                        grid[y][x] = n
                        solve()
                        # backtracking - bad choice
                        grid[y][x] = 0
                return
    print(np.matrix(grid))
    input("More?")
```
A few suggestions:
You might want to use a set instead of a grid, adding a square as soon as it is visited if it is not a member of the set yet.
The counter and the grid can be global, but it would probably be easier to pass them as arguments to the function at first. Once the solution is clearer you can worry about those details.
You are going about the problem the wrong way. It would be good to have one function that calculates the number of paths from the origin to the destination by calling itself for the neighbors that have not been visited yet (make sure you update the grid). On top of that you can have a function that calls the path function for every combination of origin and destination. A small tip: you do not have to calculate the same path in the reverse direction! You can keep a map of the path counts already calculated. If the opposite direction has been calculated, don't bother; just double the number of paths afterwards.
Good luck!
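The first two suggestions can be sketched as follows: a set of visited points that is passed as an argument instead of a global grid. This is only one possible reading of the advice. count_paths and walk are hypothetical names, and the sketch assumes that the path travels between the corner points of the cells, so an x*y grid has (x + 1) * (y + 1) points.

```python
def count_paths(x, y):
    """Count self-avoiding paths between opposite corners of an x*y grid of cells."""
    # corner points have coordinates 0..x and 0..y
    start, goal = (0, 0), (x, y)

    def walk(point, visited):
        if point == goal:
            return 1
        total = 0
        px, py = point
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (px + dx, py + dy)
            if 0 <= nxt[0] <= x and 0 <= nxt[1] <= y and nxt not in visited:
                visited.add(nxt)
                total += walk(nxt, visited)
                visited.remove(nxt)  # backtrack
        return total

    return walk(start, {start})

print(count_paths(2, 3))  # prints 38
```

By symmetry it does not matter which pair of opposite corners is used, so starting at (0, 0) gives the same count as going from the bottom left to the top right.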
I will show you a solution on a coordinate system where (0,0) is the top left and (maxY,maxX) is the bottom right. Going right increases x and going down increases y.
1- If you are trying to solve the exact maze in the image, then your grid array shape is wrong. Notice that you are travelling between the corners of the squares, so there are 4 points where you can be horizontally and 3 points where you can be vertically.
2- The hint tells you to use a boolean mask for the visited state, but you already have a grid array, so a separate array is not necessary.
3- The main problem with your code is how you progress through the maze. The loop structure
for i in range(0, x):
    for j in range(0, y):
does not make sense, because when you are at a position (x, y) you can only move in the 4 main directions (right, up, left, down). These loops make it look like you are trying to branch into all positions behind you, which is not valid. My code shows the traversal explicitly.
grid = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [1, 0, 0, 0]]
# number of solutions
solution = 0
# maximum values of x and y coordinates
maxX = len(grid[0])-1
maxY = len(grid)-1
# endpoint coordinates, top(y=0) right(x=maxX) of the maze
endX = maxX
endY = 0
# starting point coordinates, bottom(y=maxY) left(x=0) of the maze
mazeStartX = 0
mazeStartY = maxY
def number_of_paths(startX, startY):
    global solution
    global grid
    # if we reached the goal, return at this point
    if (startX == endX and startY == endY):
        solution += 1
        return
    # possible directions are
    #RIGHT (+1x, 0y)
    #UP (0x, -1y)
    #LEFT (-1x, 0y)
    #DOWN (0x, +1y)
    # I use a direction array like this to avoid nested ifs inside the for loop
    dx = [1, 0, -1, 0]
    dy = [0, -1, 0, 1]
    for d in range(len(dx)):
        newX = startX + dx[d]
        newY = startY + dy[d]
        # out of maze bounds
        if (newX < 0 or newY < 0):
            continue
        # out of maze bounds
        if (newX > maxX or newY > maxY):
            continue
        if (grid[newY][newX] == 1):
            # this point is already visited
            continue
        else:
            # branch from this point
            grid[newY][newX] = 1
            number_of_paths(newX, newY)
            grid[newY][newX] = 0
if __name__ == '__main__':
    number_of_paths(mazeStartX, mazeStartY)
    print(grid)
    print(solution)

How can "self" update original variable correctly? Recursion/Backtracking in the N-queens problem (Python)

This is my python program to solve the 8-queens problem. Everything is working except the final step of printing the solved board. I use recursion/backtracking to fill the board with queens until a solution is found. The board object that holds the solution is self, which is a reference to b1, so I assume that b1, the original board I initialized, would be updated to contain the final solved board, and would print the solution using printBoard. However, b1 is not being updated and is holding a failed board when I print it for some unknown reason.
edit: added placeQueen in solve
EMPTY = 0
QUEEN = 1
RESTRICTED = 2
class Board:
    # initializes a 8x8 array
    def __init__ (self):
        self.board = [[EMPTY for x in range(8)] for y in range(8)]
    # pretty prints board
    def printBoard(self):
        for row in self.board:
            print(row)
    # places a queen on a board
    def placeQueen(self, x, y):
        # restricts row
        self.board[y] = [RESTRICTED for i in range(8)]
        # restricts column
        for row in self.board:
            row[x] = RESTRICTED
        # places queen
        self.board[y][x] = QUEEN
        self.fillDiagonal(x, y, 0, 0, -1, -1) # restricts top left diagonal
        self.fillDiagonal(x, y, 7, 0, 1, -1) # restricts top right diagonal
        self.fillDiagonal(x, y, 0, 7, -1, 1) # restricts bottom left diagonal
        self.fillDiagonal(x, y, 7, 7, 1, 1) # restricts bottom right diagonal
    # restricts a diagonal in a specified direction
    def fillDiagonal(self, x, y, xlim, ylim, xadd, yadd):
        if x != xlim and y != ylim:
            self.board[y + yadd][x + xadd] = RESTRICTED
            self.fillDiagonal(x + xadd, y + yadd, xlim, ylim, xadd, yadd)
    # recursively places queens such that no queen shares a row or
    # column with another queen, or in other words, no queen sits on a
    # restricted square. Should solve by backtracking until solution is found.
    def solve(self, col):
        if col == -1:
            return True
        for i in range(8):
            if self.board[i][col] == EMPTY:
                temp = self.copy()
                self.placeQueen(col, i)
                if self.solve(col - 1):
                    return True
                temp.board[i][col] = RESTRICTED
                self = temp.copy()
        return False
    # deep copies a board onto another board
    def copy(self):
        copy = Board()
        for i in range(8):
            for j in range (8):
                copy.board[j][i] = self.board[j][i]
        return copy
b1 = Board()
b1.solve(7)
b1.printBoard()
I know that my actual solver is working, because when I add a printBoard like so:
        if col == -1:
            self.printBoard()
            return True
in the solve method, a solved board is printed. In short, why is the self instance of a board not updating b1?
I believe your problem is related to redefining self in the solve method, and I'm not even sure why you're doing that.
See this question for more details: Is it safe to replace a self object by another object of the same type in a method?
Reassigning self like you're doing does not reassign the "b1" reference. So when you reference b1 again and call printBoard, you're referencing a different object than the one "self.printBoard()" refers to by the time solve is done.
I would step back and ask yourself why you're replacing self to begin with, and what this gains you. You likely don't need to, and you shouldn't be doing it either.
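The difference between rebinding self and mutating the object it refers to can be shown with a small sketch. The Counter class is a made-up example and is not part of the question's code:

```python
class Counter:
    def __init__(self):
        self.value = 0

    def rebind(self):
        # rebinding the local name `self` leaves the caller's object untouched
        self = Counter()
        self.value = 99

    def mutate(self):
        # mutating through `self` changes the object the caller holds
        self.value = 99

c = Counter()
c.rebind()
print(c.value)  # prints 0
c.mutate()
print(c.value)  # prints 99
```

Inside a method, self is an ordinary local variable, so assigning to it only changes what that name refers to within the call.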
I'm not sure how this works since placeQueen is never called. As such, I don't see that adding a print as suggested presents a finished board (I see it as empty). [note: the latest update fixes this]
Using the restricted squares idea could work, but the way it's implemented here (without an undo option) is inefficient; copying a whole new Board object for every inner loop is very expensive. For all the trouble, we could just as well perform an iterative conflict check per move which at least saves the allocation and garbage collection costs of a new heap object.
As far as returning the completed board result, use a return value of self or self.board and None on failure rather than True and False.
A few other points:
Since solving a puzzle doesn't require state and we can (hopefully) agree that copying the board is inefficient, I'm not sure if there's much point in allowing an __init__ method. The class is nice as an encapsulation construct and we should hide static variables like EMPTY, QUEEN, etc inside the Board class regardless of whether the class is static or instantiated.
If you do decide to keep the class stateful, printBoard should not produce side effects--override __str__ instead.
Don't hardcode size literals such as 8 throughout the code; this makes the class rigid, difficult to maintain and prone to typos and off-by-one errors. Use len(self.board) instead and provide parameters liberally.
fillDiagonal doesn't need to be recursive. Consider using list comprehensions or numpy to simplify this matrix traversal logic.
Use snake_case variable names and docstrings instead of hashtag comments per PEP-8. If you feel compelled to write a comment like # restricts column, consider moving the relevant chunk to a function called restrict_column(...) and skip the comment.
Here's an initial rewrite that implements a few of these points:
class Board:
    EMPTY = 0
    QUEEN = 1
    DIRS = [(x, y) for x in range(-1, 2) for y in range(-1, 2) if x]
    def __init__ (self, size=8):
        self.board = [[Board.EMPTY] * size for _ in range(size)]
    def __str__(self):
        return "\n".join(map(str, self.board))
    def legal_from(self, row, col, dr, dc):
        while row >= 0 and row < len(self.board) and \
              col >= 0 and col < len(self.board[row]):
            if self.board[row][col] != Board.EMPTY:
                return False
            row += dr; col += dc
        return True
    def legal_move(self, row, col):
        return all([self.legal_from(row, col, *d) for d in Board.DIRS])
    def solve(self, row=0):
        if row >= len(self.board):
            return self
        for col in range(len(self.board[row])):
            if self.legal_move(row, col):
                self.board[row][col] = Board.QUEEN
                if self.solve(row + 1):
                    return self
                self.board[row][col] = Board.EMPTY
if __name__ == "__main__":
    for result in [Board(i).solve() for i in range(9)]:
        print(result, "\n")
Output:
[1]
None
None
[0, 1, 0, 0]
[0, 0, 0, 1]
[1, 0, 0, 0]
[0, 0, 1, 0]
[1, 0, 0, 0, 0]
[0, 0, 1, 0, 0]
[0, 0, 0, 0, 1]
[0, 1, 0, 0, 0]
[0, 0, 0, 1, 0]
[0, 1, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 0, 1]
[1, 0, 0, 0, 0, 0]
[0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 1, 0]
[1, 0, 0, 0, 0, 0, 0]
[0, 0, 1, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0]
[0, 0, 0, 0, 0, 0, 1]
[0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 0]
[1, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 1, 0, 0]
[0, 0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 1, 0]
[0, 1, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 1, 0, 0, 0, 0]

Python: Generating from geometric distribution

Is this the best or most efficient way to generate random numbers from a geometric distribution with an array of parameters that may contain 0?
allids["c"]=[2,0,1,1,3,0,0,2,0]
[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
Note I am somewhat concerned about optimization.
EDIT:
A bit of context: I have a sequence of characters (e.g. ATCGGGA) and I would like to expand/contract runs of a single character (e.g. if the original sequence had a run of 2 'A's, I want to simulate a sequence that will have an expected value of 2 'A's, but vary according to a geometric distribution). I do NOT want characters that form runs of length 1 to be of variable length.
So if
seq = 'AATCGGGAA'
allids["c"]=[2,0,1,1,3,0,0,2,0]
rep=[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
"".join([s*r for r, s in zip(rep, seq)])
will output (when rep is [1, 0, 1, 1, 3, 0, 0, 1, 0])
"ATCGGGA"
You can use a masked array to avoid the division by zero.
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.random.geometric(1. / a)
rep[a.mask] = 0
This generates a random sample for each element of a, and then deletes some of them later. If you're concerned about this waste of random numbers, you could generate just enough, like so:
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.zeros(a.shape, dtype=int)
rep[~a.mask] = np.random.geometric(1. / a[~a.mask])
What about this:
import numpy
counts = numpy.array([2, 0, 1, 1, 3, 0, 0, 2, 0], dtype=float)
counts_ma = numpy.ma.array(counts, mask=(counts == 0))
# the mask lives on the masked array, not on the plain counts array
nonzero = numpy.logical_not(counts_ma.mask)
counts[nonzero] = [numpy.random.geometric(v) for v in 1.0 / counts[nonzero]]
You could potentially precompute the distribution of homopolymer runs and limit the number of calls to geometric, as fetching large numbers of values from RNGs is more efficient than making individual calls.
