I'm coding my first genetic algorithm in Python.
I particularly care about the optimization and population scalability.
import numpy as np
population = np.random.randint(-1, 2, size=(10,10))
Here I create a 10x10 array of random integers between -1 and 1.
Now I want to perform a specific mutation (the mutation rate depends on the specimen's fitness) for each specimen of my array.
For example, I have:
print(population)
[[ 0 0 1 1 -1 1 1 0 1 0]
[ 0 1 -1 -1 0 1 -1 -1 0 -1]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 1 1 0 0 0 1 1 0 1]
[ 1 -1 0 0 1 0 -1 1 1 0]
[ 1 -1 1 -1 0 -1 0 0 1 1]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 1 1 0 0 0 1 -1 1 0]]
I want to mutate this array with a specific mutation rate for each sub-array in population. I tried the following, but it is not well optimized, and I need to perform a different mutation for each sub-array (each sub-array is a specimen) in the population (the main array, "population").
population[0][np.random.randint(0, len(population[0]), size=10 // 2)] = np.random.randint(-1, 2, size=10 // 2)
I'm looking for a way to apply something like a mutation mask to the whole main array. Something like this:
population[array element select with the mutation rate] = random_array[with the same size]
I think it's the best way (because we only do an array selection and then replace this selection with the random array), but I don't know how to do this. If you have another solution, I'm open to it ^^.
Let's say you have an array fitness with the fitness of each specimen, with size len(population). Let's also say you have a function fitness_mutation_prob that, for a given fitness, gives you the mutation probability for each of the elements in the specimen. For example, if the values of fitness range from 0 to 1, fitness_mutation_prob(fitness) could be something like (1 - fitness), or np.square(1 - fitness), or whatever. You can then do:
r = np.random.random(size=population.shape)
mut_probs = fitness_mutation_prob(fitness)
m = r < mut_probs[:, np.newaxis]
population[m] = np.random.randint(-1, 2, size=np.count_nonzero(m))
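To make the mask approach above concrete, here is a minimal self-contained sketch. The fitness values and the choice of fitness_mutation_prob as (1 - fitness) are assumptions for illustration only; any function mapping fitness to a probability in [0, 1] would fit the same pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.integers(-1, 2, size=(10, 10))

# Hypothetical fitness values in [0, 1]: first specimen worst, last specimen best.
fitness = np.linspace(0.0, 1.0, len(population))

def fitness_mutation_prob(fitness):
    # Assumed mapping: the fitter the specimen, the lower its mutation rate.
    return 1.0 - fitness

before = population.copy()
r = rng.random(size=population.shape)                  # one uniform draw per gene
m = r < fitness_mutation_prob(fitness)[:, np.newaxis]  # per-row threshold, broadcast over columns
population[m] = rng.integers(-1, 2, size=np.count_nonzero(m))

print(m.sum(axis=1))  # number of genes selected for mutation in each specimen
```

The row with fitness 1.0 has mutation probability 0 and is never selected, while the row with fitness 0.0 has every gene selected (a selected gene can still draw its old value by chance).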
I am attempting to build a small program that will help forecast future demand based on some inputs. Three key pieces of input are demand, on-hand, and in-transit. The demand is passed as a list of 52 elements, and the in-transit array would need to match its length. The problem I'm running into is initializing the in-transit data.
Demand data looks like this:
d = [100, 221, 470, 100, 250,...]
For the program to properly forecast, I need to pass the in-transit data in over a 52 week period. If I only have inbound inventory in week 3 for example, my data would look like this:
transit = [0, 0, 378,...]
Is there a way that I can pass this data into a numpy array and feed this to the program? Currently I'm using np.zeros to initialize, but that would only work if I didn't have inventory scheduled to arrive.
Code Snippet
# Determine the starting on hand and transit arrays
hand = np.zeros(time, dtype=int)
transit = np.zeros((time,L+1), dtype=int)
What the beginning array outputs when initialized with np.zeros:
[[ 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 6429]
[ 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0]]
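One possible way to seed the in-transit data, sketched here for the 1-D weekly case only: start from zeros and write the known arrivals into their week slots. The inbound dictionary (1-based week number mapped to units arriving) is a hypothetical input format, not something from the original program, and how these values map onto the 2-D (time, L+1) array depends on logic not shown in the question.

```python
import numpy as np

weeks = 52

# Hypothetical schedule: 1-based week number -> units arriving that week.
inbound = {3: 378}

transit = np.zeros(weeks, dtype=int)
for week, qty in inbound.items():
    transit[week - 1] = qty  # week 3 lands at index 2

print(transit[:5])
```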
Given an n*n matrix M (containing only 0 and 1), I want to build the matrix that contains a 1 in position (i, j) if and only if there is at least one 1 in the bottom-right submatrix M[i:n, j:n].
Please note that I know there are optimal algorithms to compute this, but for performance reasons, I'm looking for a solution using numpy (so the algorithm is fully compiled).
Example:
Given this matrix:
0 0 0 0 1
0 0 1 0 0
0 0 0 0 1
1 0 1 0 0
I'm looking for a way to compute this matrix:
0 0 0 0 1
0 0 1 1 1
0 0 1 1 1
1 1 1 1 1
Thanks
Using numpy, you can accumulate the maximum value over each axis:
import numpy as np
M = np.array([[0,0,0,0,1],
[0,0,1,0,0],
[0,0,0,0,1],
[1,0,1,0,0]])
M = np.maximum.accumulate(M)
M = np.maximum.accumulate(M,axis=1)
print(M)
[[0 0 0 0 1]
[0 0 1 1 1]
[0 0 1 1 1]
[1 1 1 1 1]]
Note: This matches your example result (presence of a 1 in the top-left quadrant). Your explanation of the logic would produce a different result, however.
If we go with M[i:n,j:n] (bottom-right):
M = np.array([[0,0,0,0,1],
[0,0,1,0,0],
[0,0,0,0,1],
[1,0,1,0,0]])
M = np.maximum.accumulate(M[::-1,:])[::-1,:]
M = np.maximum.accumulate(M[:,::-1],axis=1)[:,::-1]
print(M)
[[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 0 0]]
It is essentially the same approach, except with reversed accumulation on the axes.
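As a sanity check on the reversed accumulation, the sketch below compares it with a direct, slow evaluation of the definition M[i:n, j:n]. The function names bottom_right_any and brute_force are made up for this illustration.

```python
import numpy as np

def bottom_right_any(M):
    # Running maximum from the bottom row upwards, then from the right column leftwards.
    out = np.maximum.accumulate(M[::-1, :], axis=0)[::-1, :]
    return np.maximum.accumulate(out[:, ::-1], axis=1)[:, ::-1]

def brute_force(M):
    # Direct evaluation of the definition: is there any 1 in M[i:, j:]?
    rows, cols = M.shape
    return np.array([[int(M[i:, j:].any()) for j in range(cols)]
                     for i in range(rows)])

M = np.array([[0, 0, 0, 0, 1],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 1],
              [1, 0, 1, 0, 0]])

print(np.array_equal(bottom_right_any(M), brute_force(M)))
```

The brute-force version is only there for verification; it rescans a submatrix for every cell, whereas the accumulate version makes one pass per axis.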
I have a two-dimensional numpy array like:
[[0 0 0 0 0 0 0 0 1 1]
[0 0 0 1 0 1 0 0 0 1]
[1 0 1 0 0 0 1 0 0 1]
[1 0 0 0 0 0 0 0 1 0]
[0 1 0 0 0 1 0 1 1 0]
[0 0 0 1 1 0 0 0 0 0]
[0 1 1 1 1 1 0 0 0 0]
[1 0 0 0 1 0 1 0 0 0]
[0 0 0 0 0 0 0 1 0 0]
[0 1 0 0 0 0 0 0 0 0]]
We can think of it as a map that is viewed from above.
I'll pick a random cell, let's say line 3 column 4 (start counting at 0). If the cell contains a 1, there is no problem. If the cell is a 0, I need to find the index of the nearest 1.
Here, line 3 column 4 is a 0, I want a way to find the nearest 1 which is line 4 column 5.
If two cells containing 1 are at the same distance, I don't care which one I get.
Borders are not inter-connected, i.e. the nearest 1 for the cell at line 7 column 9 is not the 1 at line 7 column 0.
Of course this is a simplified example of my problem: my actual np arrays do not contain zeros and ones but rather Nones and floats.
This is a simple "path-finding" problem. Prepare an empty queue of coordinates and push a starting position to the queue. Then, pop the first element from the queue and check location and if it's 1 return the coordinates, otherwise push all neighbours to the queue and repeat.
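from collections import deque
import numpy as np

ADJACENT = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def find(data: np.ndarray, start: tuple):
    queue = deque([start])
    visited = {start}  # cells already queued, so each cell is examined at most once
    while queue:
        pos = queue.popleft()
        if data[pos[0], pos[1]]:
            return pos
        for dx, dy in ADJACENT:
            x, y = pos[0] + dx, pos[1] + dy
            # stay inside the grid (borders are not connected) and skip seen cells
            if 0 <= x < data.shape[0] and 0 <= y < data.shape[1] and (x, y) not in visited:
                visited.add((x, y))
                queue.append((x, y))
    return None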
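As an alternative to the queue-based search, here is a brute-force vectorized sketch: collect the coordinates of every truthy cell with np.argwhere and pick the one with the smallest squared Euclidean distance. This assumes a numeric 0/1 array like the simplified example (not the None/float arrays mentioned in the question), and the function name nearest_nonzero is made up for illustration. Note that it measures straight-line distance, whereas the queue-based search counts horizontal and vertical steps, so the two can disagree on some grids.

```python
import numpy as np

def nearest_nonzero(data, start):
    ones = np.argwhere(data)              # (k, 2) array of (row, col) of truthy cells
    if ones.size == 0:
        return None                       # nothing to find
    d2 = ((ones - np.asarray(start)) ** 2).sum(axis=1)  # squared distances to start
    return tuple(ones[d2.argmin()])       # ties resolved by row-major order

grid = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
                 [0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
                 [1, 0, 1, 0, 0, 0, 1, 0, 0, 1],
                 [1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
                 [0, 1, 0, 0, 0, 1, 0, 1, 1, 0],
                 [0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 1, 0, 1, 0, 0, 0],
                 [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
                 [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]])

print(nearest_nonzero(grid, (3, 4)))
```

This scans the whole array on every call, so it is a reasonable fit for small maps or few queries rather than repeated lookups on large ones.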
I am simulating protein folding on a 2D grid where every angle is either ±90° or 0°, and have the following problem:
I have an n-by-n numpy array filled with zeros, except for certain places where the value is any integer from 1 to n. Every integer appears just once. Integer k is always a nearest neighbour to k-1 and k + 1, except for the endpoints. The array is saved as an object in the class Grid which I have created for doing energy calculations and folding the protein. Example array, with n=5:
>>> from Grid import Grid
>>> a = Grid(5)
>>> a.show()
[[0 0 0 0 0]
[0 0 0 0 0]
[1 2 3 4 5]
[0 0 0 0 0]
[0 0 0 0 0]]
My goal is to find the longest consecutive line of non-zero elements without any bends. In the above case, the result should be 5.
My idea so far is something like this:
def getDiameter(self):
    indexes = np.zeros((self.n, 2))
    for i in range(1, self.n + 1):
        indexes[i - 1] = np.argwhere(self.array == i)[0]
    for i in range(self.n):
        j = 1
        currentDiameter = 1
        while indexes[0][i] == indexes[0][i + j] and i + j <= self.n:
            currentDiameter += 1
            j += 1
        while indexes[i][0] == indexes[i + j][0] and i + j <= self.n:
            currentDiameter += 1
            j += 1
        if currentDiameter > diameter:
            diameter = currentDiameter
    return diameter
This has two problems: (1) it doesn't work, and (2) it is horribly inefficient if I get it to work. I am wondering if anybody has a better way of doing this. If anything is unclear, please let me know.
Edit:
Less trivial example
[[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 10 0 0 0]
[ 0 0 0 0 0 0 9 0 0 0]
[ 0 0 0 0 0 0 8 0 0 0]
[ 0 0 0 4 5 6 7 0 0 0]
[ 0 0 0 3 0 0 0 0 0 0]
[ 0 0 0 2 1 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0]]
The correct answer here is 4 (both the longest column and the longest row have four non-zero elements).
What I understood from your question is that you need to find the length of the longest run of consecutive elements in a numpy array (row by row).
So for this below one, the output should be 5:
[[1 2 3 4 0]
[0 0 0 0 0]
[10 11 12 13 14]
[0 1 2 3 0]
[1 0 0 0 0]]
Because [10 11 12 13 14] are consecutive elements, and they form a longer run than the consecutive elements in any other row.
If this is what you are expecting, consider this:
import numpy as np
from itertools import groupby
a = np.array([[1, 2, 3, 4, 0],
[0, 0, 0, 0, 0],
[10, 11, 12, 13, 14],
[0, 1, 2, 3, 0],
[1, 0, 0, 0, 0]])
a = a.astype(float)
a[a == 0] = np.nan
b = np.diff(a) # Calculate the n-th discrete difference. Consecutive numbers will have a difference of 1.
counter = []
for line in b: # for each row.
    if 1 in line: # consecutive elements differ by 1.
        counter.append(max(sum(1 for _ in g) for k, g in groupby(line) if k == 1) + 1) # find the longest length of consecutive 1's for each row.
print(max(counter)) # find the max of list holding the longest length of consecutive 1's for each row.
# 5
For your particular example:
[[0 0 0 0 0]
[0 0 0 0 0]
[1 2 3 4 5]
[0 0 0 0 0]
[0 0 0 0 0]]
# 5
Start by finding the longest consecutive occurrence in a list:
def find_longest(l):
    counter = 0
    counters = []
    for i in l:
        if i == 0:
            counters.append(counter)
            counter = 0
        else:
            counter += 1
    counters.append(counter)
    return max(counters)
Now you can apply this function to each row and each column of the array, and find the maximum:
longest_occurrences = [find_longest(row) for row in a] + [find_longest(col) for col in a.T]
longest_occurrence = max(longest_occurrences)
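Putting the row-and-column scan together on the less trivial 10x10 example from the question gives a quick check. This is a self-contained sketch that repeats find_longest so it can run on its own; the array a is typed in by hand from the question.

```python
import numpy as np

def find_longest(l):
    # Length of the longest run of non-zero values in a 1-D sequence.
    counter = 0
    counters = []
    for i in l:
        if i == 0:
            counters.append(counter)
            counter = 0
        else:
            counter += 1
    counters.append(counter)
    return max(counters)

a = np.zeros((10, 10), dtype=int)
a[7, 4], a[7, 3], a[6, 3] = 1, 2, 3          # 1, 2 on row 7, then up to 3
a[5, 3:7] = [4, 5, 6, 7]                      # straight run along row 5
a[4, 6], a[3, 6], a[2, 6] = 8, 9, 10          # straight run up column 6

# Scan every row, then every column (rows of the transpose).
longest = max([find_longest(row) for row in a] +
              [find_longest(col) for col in a.T])
print(longest)
```

Row 5 (4 5 6 7) and column 6 (10 9 8 7) both have four non-zero cells in a line, which matches the expected answer of 4 stated in the question.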
I have a 3D image with size Deep x Weight x Height (for example, 10x20x30 means 10 images, each of size 20x30).
Given a patch size pd x pw x ph (such that pd < Deep, pw < Weight, ph < Height), for example 4x4x4, the center point of the patch will be at pd/2 x pw/2 x ph/2. Let's call the distance the center point moves between time t and time t+1 the stride, for example stride=2.
I want to split the original 3D image into patches with the size and stride given above. How can I do it in Python? Thank you.
Use np.lib.stride_tricks.as_strided. This solution does not require the strides to divide the corresponding dimensions of the input stack. It even allows for overlapping patches (Just do not write to the result in this case, or make a copy.). It therefore is more flexible than other approaches:
import numpy as np
from numpy.lib import stride_tricks
def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6#.reshape(-1, *blck)
#demo
x = np.zeros((5, 6, 12), int)
y = cutup(x, (2, 2, 3), (3, 3, 5))
y[...] = 1
print(x[..., 0], '\n')
print(x[:, 0, :], '\n')
print(x[0, ...], '\n')
Output:
[[1 1 0 1 1 0]
[1 1 0 1 1 0]
[0 0 0 0 0 0]
[1 1 0 1 1 0]
[1 1 0 1 1 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]]
[[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[1 1 1 0 0 1 1 1 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0]]
Explanation. Numpy arrays are organised in terms of strides, one for each dimension, data point [x,y,z] is located in memory at address base + stridex * x + stridey * y + stridez * z.
The stride_tricks.as_strided factory lets you directly manipulate the strides and shape of a new array sharing its memory with a given array. Try this only if you know what you're doing, because no checks are performed, meaning you are allowed to shoot yourself in the foot by addressing out-of-bounds memory.
The code uses this function to split up each of the three existing dimensions into two new ones, one for the corresponding within-block coordinate (this will have the same stride as the original dimension, because adjacent points in a block correspond to adjacent points in the whole stack) and one dimension for the block index along this axis; this will have stride = original stride x block stride.
All the code does is compute the correct strides and dimensions (= block dimensions and block counts along the three axes).
Since the data are shared with the original array, when we set all points of the 6d array to 1, they are also set in the original array exposing the block structure in the demo. Note that the commented out reshape in the last line of the function breaks this link, because it forces a copy.
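For the sizes in the question (a 10x20x30 stack, 4x4x4 patches, stride 2), a bounds-checked alternative is sketched below using numpy.lib.stride_tricks.sliding_window_view, which assumes NumPy 1.20 or newer. It is a swapped-in technique, not the cutup function above: it builds every stride-1 window as a read-only view and then keeps every second one along each axis.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Stand-in for the 3D image: 10 slices of 20x30, filled with distinct values.
x = np.arange(10 * 20 * 30).reshape(10, 20, 30)

# All 4x4x4 windows at stride 1: shape (7, 17, 27, 4, 4, 4), no data copied.
windows = sliding_window_view(x, (4, 4, 4))

# Keep every second window along each axis to get stride 2.
patches = windows[::2, ::2, ::2]
print(patches.shape)  # (4, 9, 14, 4, 4, 4)

# patches[i, j, k] is the block starting at (2*i, 2*j, 2*k) in the original stack.
print(np.array_equal(patches[1, 2, 3], x[2:6, 4:8, 6:10]))
```

The first three axes index the patch and the last three index positions inside it, the same layout as the 6d array returned by the strided approach. The result is read-only by default, which avoids accidental writes into overlapping patches.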
The skimage module offers an integrated solution with view_as_blocks.
The source is online.
Take care to choose Deep, Weight, Height as multiples of pd, pw, ph, because as_strided does not check bounds.