Linear Programming in 'scipy.linprog' with non-standard functions - python

I am trying to learn about implementation of linear programming (LP) problems in scipy.linprog. I understand how it works with basic functions, for example:
max 2x+3y
st. 2x-y <= 0
5x+y >= -10
from scipy.optimize import linprog
c = [-2, -3]
A = [[2, -1], [-5, -1]]
b = [0, 10]
Now, when it comes to more complex cases, like this, there are quite a few things I don't understand.
First, I don't understand how and why Minimize 1*abs(x_0) + 1*abs(x_1) + ... + 1*abs(x_n) is changed to Minimize 1*y_0 + 1*y_1 + ... + 1*y_n by adding the constraint -y_i <= x_i <= y_i for every i. EDIT: I understand now that y_i is a slack variable, which is also the reason the A_aux matrix is added. But I don't understand the shape of that matrix.
Second, here they set the objective vector to c = np.hstack((np.zeros(N), np.ones(N_AUX))). In my example above, there are two variables, which gives a list of the length 2. But here there are N variables, but c is a list of N*2? In my example it is easy to understand the values, but here they are [0,0,0,0,0,1,1,1,1,1] and I don't see any hint in the objective function as to why these values are chosen.
Third, I understand the A_orig matrix.
A_orig = [[0, 1, -1, 0, 0, 0, 0, 0, 0, 0], # orig constraint 1
[0, 0, 1, -1, 0, 0, 0, 0, 0, 0], # orig constraint 2
[-1, -1, 0, 0, 0, 0, 0, 0, 0, 0], # more interesting problem
[0, -1, -1, 0, 0, 0, 0, 0, 0, 0]] # "" "" ""
Each row simply holds the coefficients of x_i and y_i for i in range(1,6) in one constraint.
But then a second matrix A_aux is added:
A_aux = [[-1, 0, 0, 0, 0, -1, 0, 0, 0, 0],
[0, -1, 0, 0, 0, 0, -1, 0, 0, 0],
[0, 0, -1, 0, 0, 0, 0, -1, 0, 0],
[0, 0, 0, -1, 0, 0, 0, 0, -1, 0],
[0, 0, 0, 0, -1, 0, 0, 0, 0, -1],
[1, 0, 0, 0, 0, -1, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, -1, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0, -1, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, -1, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, -1]]
Which is a skew-symmetric matrix. And I have no idea and can't find any information about why this is added. I don't understand why it is 10*10 either.
Finally, bounds are added. bnds = [(0, 50) for i in range(N)] + [(None, None) for i in range(N_AUX)] # some custom bounds.
Here I find it strange that they set (0,50) for the first 5 rows, and then no bounds for the last 5 rows. I don't see any indication to why it should be bounded by 50?
There are quite a lot of questions here. But if someone could point me in the right direction I would be very grateful.
Thank you.

This is rather a math question than a programming question. Anyway, to answer your other questions:
In my example above, there are two variables, which gives a list of the length 2. But here there are N variables, but c is a list of N*2?
You need to introduce an additional variable y_i for each abs(x_i). You have N terms abs(x_i), so you'll have N + N = 2*N variables after linearizing all the absolute values. The variables x_i don't occur in the reformulated objective, so each coefficient of x_i is zero there, and each coefficient of y_i is 1.
Which is a skew-symmetric matrix. And I have no idea and can't find any information about why this is added.
You need the matrix A_aux to model the constraints -y_i <= x_i <= y_i. Both y_i and x_i are optimization variables, so you can't pass these constraints as variable bounds. That only works if the bounds are given constants, like -1 <= x_i <= 2.
Let's say you have the constraints -y_1 <= x_1 <= y_1 and -y_2 <= x_2 <= y_2. This is the same as:
-x_1 - y_1 <= 0
x_1 - y_1 <= 0
-x_2 - y_2 <= 0
x_2 - y_2 <= 0
Adding the hidden zeros, this is the same as
-1*x_1 + 0*x_2 - 1*y_1 + 0*y_2 <= 0
1*x_1 + 0*x_2 - 1*y_1 + 0*y_2 <= 0
0*x_1 - 1*x_2 + 0*y_1 - 1*y_2 <= 0
0*x_1 + 1*x_2 + 0*y_1 - 1*y_2 <= 0
Can you see it now? It's just a simple matrix-vector product:
( -1   0  -1   0 )   ( x_1 )      ( 0 )
(  1   0  -1   0 ) * ( x_2 )  <=  ( 0 )
(  0  -1   0  -1 )   ( y_1 )      ( 0 )
(  0   1   0  -1 )   ( y_2 )      ( 0 )
Here, the left matrix is just A_aux. If you change the row order, you'll get exactly the same matrix as in your linked answer.
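As a rough sketch of why A_aux is 10x10 for N = 5: there are two inequalities per variable (2*N rows) and the variable vector holds all x_i followed by all y_i (2*N columns). The helper name build_a_aux is made up for this illustration and is not part of scipy or the linked answer.

```python
import numpy as np

def build_a_aux(n):
    """Stack the rows for -x_i - y_i <= 0 on top of the rows for x_i - y_i <= 0."""
    identity = np.eye(n, dtype=int)
    upper = np.hstack((-identity, -identity))  # -x_i - y_i <= 0
    lower = np.hstack((identity, -identity))   #  x_i - y_i <= 0
    return np.vstack((upper, lower))

A_aux = build_a_aux(5)
print(A_aux.shape)  # 2*N constraint rows by 2*N variable columns
print(A_aux)
```

For n = 2 this gives the same four rows as the small matrix above, only in a different order.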
Finally, bounds are added. bnds = [(0, 50) for i in range(N)] + [(None, None) for i in range(N_AUX)] # some custom bounds. Here I find it strange that they set (0,50) for the first 5 rows, and then no bounds for the last 5 rows. I don't see any indication to why it should be bounded by 50?
The first N variables (x) are non-negative after the reformulation. The N additional variables y are unbounded, i.e. they can be negative or positive, so no bounds are given for them. However, there's no real need to bound the x variables by 50 from above. It's only meant as a tighter bound on the optimization variables, based on the right-hand sides of the other linear constraints in the example.
PS: I can second Sascha's comment in your linked answer, that you should prefer a library like cvxpy if you don't want to do all these transformations/reformulations on your own.
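To see all the pieces together, here is a minimal sketch on a made-up toy problem (not the one from the linked answer): minimize abs(x_0) + abs(x_1) subject to x_0 - x_1 >= 2. The constraint, the size N = 2 and the free bounds are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.optimize import linprog

N = 2  # number of original variables x_i (toy size, assumed)

# Variable vector is [x_0, x_1, y_0, y_1]; only the y_i appear in the objective.
c = np.hstack((np.zeros(N), np.ones(N)))

identity = np.eye(N)
# -x_i - y_i <= 0 and x_i - y_i <= 0 together encode abs(x_i) <= y_i.
A_aux = np.vstack((np.hstack((-identity, -identity)),
                   np.hstack((identity, -identity))))
b_aux = np.zeros(2 * N)

# Made-up original constraint: x_0 - x_1 >= 2, written as -x_0 + x_1 <= -2.
A_orig = np.array([[-1.0, 1.0, 0.0, 0.0]])
b_orig = np.array([-2.0])

res = linprog(c,
              A_ub=np.vstack((A_orig, A_aux)),
              b_ub=np.hstack((b_orig, b_aux)),
              bounds=[(None, None)] * (2 * N))  # all variables free here
print(res.status, res.fun, res.x)
```

The y_i need no explicit bounds because the auxiliary rows already force y_i >= abs(x_i) >= 0, and minimizing their sum pushes each y_i down onto abs(x_i).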

Related

A-star (A*) search algorithm on labyrinth matrix in python [duplicate]

This question already has an answer here:
A star algorithm: Distance heuristics
(1 answer)
Closed 2 years ago.
I have a labyrinth matrix for a maze problem.
Labyrinth =
[[0, 0, 0, 0, 0, 0, 1, 0],
[0, 1, 0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 1, 1, 3, 0],
[0, 0, 1, 1, 1, 0, 0, 0],
[0, 1, 2, 0, 1, 1, 1, 0],
[0, 1, 0, 0, 0, 0, 0, 0]]
Here,
0 represents a blocked cell that is a wall
1 represents an empty cell
2 and 3 represent the starting and ending points, respectively.
I need a function which can return the path from point 2 to 3 after performing an A* Search Algorithm using Manhattan distance as distance estimate and length of the current path as path-cost.
Any pointers, tips, or clues on how I should approach this?
Update: I want to return path from begin to end by marking the path with some other character like X. For reference, this:
Labyrinth =
[[0, 0, 0, 0, 0, 0, 1, 0],
[0, 1, 0, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, X, X, 3, 0],
[0, 0, X, X, X, 0, 0, 0],
[0, 1, 2, 0, 1, 1, 1, 0],
[0, 1, 0, 0, 0, 0, 0, 0]]
Classical search algorithms work with a set of states called the fringe and a set of visited states:
the fringe is the set of states that are yet to be explored in the hope of finding the goal state
the visited set contains all the states that have already been visited, to avoid visiting them again
The idea of A* is to explore the state in the fringe that has the minimal cost, defined as the sum of the heuristic cost and the path cost (accumulated over all the states you had to pass through to get there). You can find a generic implementation of this algorithm on the Wikipedia page for the A* search algorithm. In your case a state may consist of:
the i, j position in the grid
the previous state (None for the first state)
the total cost of this state (heuristic + path cost).
To explore a state you only need to check the direct neighbors of the cell (including only those that are not walls). It is worth noting that in the visited set you should store only the position (i, j) and the cost (as you may re-enter this state if you find a shorter path, even if that is unlikely in your problem).
Here is an example that works for your case (but may be generalized easily):
def astar(lab):
    # first, let's look for the beginning position, there is better but it works
    (i_s, j_s) = [[(i, j) for j, cell in enumerate(row) if cell == 2] for i, row in enumerate(lab) if 2 in row][0][0]
    # and take the goal position (used in the heuristic)
    (i_e, j_e) = [[(i, j) for j, cell in enumerate(row) if cell == 3] for i, row in enumerate(lab) if 3 in row][0][0]
    width = len(lab[0])
    height = len(lab)
    heuristic = lambda i, j: abs(i_e - i) + abs(j_e - j)
    comp = lambda state: state[2] + state[3]  # get the total cost
    # small variation for easier code, state is (coord_tuple, previous, path_cost, heuristic_cost)
    fringe = [((i_s, j_s), list(), 0, heuristic(i_s, j_s))]
    visited = {}  # maps (i, j) to the best path cost found so far
    # stop when the fringe is empty (no path exists)
    while fringe:
        # get first state (least cost)
        state = fringe.pop(0)
        (i, j) = state[0]
        # skip stale entries: this cell was already expanded at the same or a lower cost
        if (i, j) in visited and visited[(i, j)] <= state[2]:
            continue
        # goal check
        if lab[i][j] == 3:
            path = [state[0]] + state[1]
            path.reverse()
            return path
        # set the cost (path is enough since the heuristic won't change)
        visited[(i, j)] = state[2]
        # explore neighbors (any non-zero cell is walkable)
        neighbor = list()
        if i > 0 and lab[i-1][j] > 0:  # top
            neighbor.append((i-1, j))
        if i < height - 1 and lab[i+1][j] > 0:  # bottom
            neighbor.append((i+1, j))
        if j > 0 and lab[i][j-1] > 0:  # left
            neighbor.append((i, j-1))
        if j < width - 1 and lab[i][j+1] > 0:  # right
            neighbor.append((i, j+1))
        for n in neighbor:
            next_cost = state[2] + 1
            # only re-enter a visited cell if the new path is strictly shorter
            if n in visited and visited[n] <= next_cost:
                continue
            fringe.append((n, [state[0]] + state[1], next_cost, heuristic(n[0], n[1])))
        # resort the list (SHOULD use a priority queue here to avoid re-sorting all the time)
        fringe.sort(key=comp)
    return None  # the goal is unreachable
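The comment in the code above suggests a priority queue. A possible variant, sketched here with the standard heapq module, also marks the resulting path with X as asked in the update. The names astar_heap and mark_path are invented for this sketch, and it assumes exactly one start cell (2) and one goal cell (3).

```python
import heapq

def astar_heap(lab):
    """A* on a grid where 0 is a wall; returns a list of (row, col) or None."""
    start = goal = None
    for i, row in enumerate(lab):
        for j, cell in enumerate(row):
            if cell == 2:
                start = (i, j)
            elif cell == 3:
                goal = (i, j)

    def heuristic(pos):
        # Manhattan distance to the goal
        return abs(goal[0] - pos[0]) + abs(goal[1] - pos[1])

    heap = [(heuristic(start), 0, start)]  # entries are (total cost, path cost, cell)
    best = {start: 0}                      # cheapest known path cost per cell
    parent = {start: None}
    while heap:
        _, cost, pos = heapq.heappop(heap)
        if pos == goal:
            path = []
            while pos is not None:         # walk back through the parents
                path.append(pos)
                pos = parent[pos]
            return path[::-1]
        if cost > best[pos]:
            continue                       # stale heap entry
        i, j = pos
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            inside = 0 <= ni < len(lab) and 0 <= nj < len(lab[0])
            if inside and lab[ni][nj] != 0 and cost + 1 < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = cost + 1
                parent[(ni, nj)] = pos
                heapq.heappush(heap, (cost + 1 + heuristic((ni, nj)), cost + 1, (ni, nj)))
    return None

def mark_path(lab, path, mark="X"):
    """Return a copy of the grid with the interior of the path marked."""
    marked = [row[:] for row in lab]
    for i, j in path[1:-1]:                # keep the 2 and the 3 visible
        marked[i][j] = mark
    return marked

labyrinth = [[0, 0, 0, 0, 0, 0, 1, 0],
             [0, 1, 0, 1, 1, 1, 1, 0],
             [0, 1, 1, 1, 0, 1, 0, 0],
             [0, 1, 0, 0, 0, 0, 0, 0],
             [0, 1, 1, 0, 1, 1, 3, 0],
             [0, 0, 1, 1, 1, 0, 0, 0],
             [0, 1, 2, 0, 1, 1, 1, 0],
             [0, 1, 0, 0, 0, 0, 0, 0]]
path = astar_heap(labyrinth)
for row in mark_path(labyrinth, path):
    print(row)
```

Pushing and popping on a heap costs O(log n) per operation, whereas re-sorting the whole fringe on every step costs O(n log n).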

Calling a function within function does not work as expected?

I'm designing a maze generator in python and have various functions for different steps of the process. (I know the code can most definitely be improved but I'm just looking for an answer to my problem first before I work on optimizing it)
the first function generates a base maze in the form of a 2D list and works as expected:
def base_maze(dimension):
    num_rows = int((2 * dimension[1]) + 1)     # number of rows / columns
    num_columns = int((2 * dimension[0]) + 1)  # from tuple input
    zero_row = []                              # initialise a row of 0s
    for i in range(num_columns):
        zero_row.append(0)
    norm_row = []                              # initialise a row of
    for i in range(num_columns // 2):          # alternating 0s and 1s
        norm_row.extend([0,1])
    norm_row.append(0)
    maze = []                                  # initialise maze
                                               # (combination of zero rows
    for i in range(num_rows // 2):             # and normal rows)
        maze.append(zero_row)
        maze.append(norm_row)
    maze.append(zero_row)
    return maze
Another function gets the neighbors of the selected cell, and also works as expected:
def get_neighbours(cell, dimension):
    y = cell[0]                  # set x/y values
    max_y = dimension[0] - 1     # for reference
    x = cell[1]
    max_x = dimension[1] - 1
    n = (x, y-1)                 # calculate adjacent
    e = (x+1, y)                 # coordinates
    s = (x, y+1)
    w = (x-1, y)
    if y > max_y or y < 0 or x > max_x or x < 0:        # check if x/y
        raise IndexError("Cell is out of maze bounds")  # in bounds
    neighbours = []
    if y > 0:                    # add cells to list
        neighbours.append(n)     # if they're valid
    if x < max_x:                # cells inside maze
        neighbours.append(e)
    if y < max_y:
        neighbours.append(s)
    if x > 0:
        neighbours.append(w)
    return neighbours
the next function removes the wall between two given cells:
def remove_wall(maze, cellA, cellB):
    dimension = []
    x_dim = int(((len(maze[0]) - 1) / 2))  # calc the dimensions
    y_dim = int(((len(maze) - 1) / 2))     # of maze matrix (x,y)
    dimension.append(x_dim)
    dimension.append(y_dim)
    A_loc = maze[2*cellA[1]-1][2*cellA[0]-1]
    B_loc = maze[2*cellB[1]-1][2*cellB[0]-1]
    if cellB in get_neighbours(cellA, dimension):  # if cell B is a neighbour
        # if the x pos of A is equal to x pos of cell B and the y pos
        # of A is less than B (A is below B)
        if cellA[0] == cellB[0] and cellA[1] < cellB[1]:
            adj_wall = maze[(2*cellA[0]+1)][2*cellA[1]+1+1] = 1
        # the adjacent wall is set to 1 (removed)
        elif cellA[0] == cellB[0] and cellA[1] > cellB[1]:
            adj_wall = maze[(2*cellA[0]+1)][2*cellA[1]+1-1] = 1
        # same is done for all other directions
        if cellA[1] == cellB[1] and cellA[0] < cellB[0]:
            adj_wall = maze[(2*cellA[0]+1)+1][(2*cellA[1]+1)] = 1
        elif cellA[1] == cellB[1] and cellA[0] > cellB[0]:
            adj_wall = maze[(2*cellA[0]+1-1)][(2*cellA[1]+1)] = 1
    return maze
yet when I try to put these functions together into one final function to build the maze, they do not work as they work on their own, for example:
def test():
    maze1 = base_maze([3,3])
    maze2 = [[0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0]]
    if maze1 == maze2:
        print("they are exactly the same")
    else:
        print("WHY ARE THEY DIFFERENT???")
    print(remove_wall(maze1,(0,0),(0,1)))
    print(remove_wall(maze2,(0,0),(0,1)))
these will produce different results despite the input being exactly the same?:
test()
they are exactly the same
[[0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0]]
[[0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0]]
The problem is in your base_maze function, where you first create two types of row:
zero_row = []                      # initialise a row of 0s
for i in range(num_columns):
    zero_row.append(0)
norm_row = []                      # initialise a row of
for i in range(num_columns // 2):  # alternating 0s and 1s
    norm_row.extend([0,1])
norm_row.append(0)
This is fine so far and works as expected, however when you build the maze from there
for i in range(num_rows // 2):  # and normal rows)
    maze.append(zero_row)
    maze.append(norm_row)
maze.append(zero_row)
You are filling up the maze list with multiple references to the same list object. This means that if you modify row 0 of the maze, rows 2 and 4 will also be affected. To illustrate:
>>> def print_maze(maze):
...     print('\n'.join(' '.join(str(x) for x in row) for row in maze))
...
>>> print_maze(maze)
0 0 0 0 0
0 1 0 1 0
0 0 0 0 0
0 1 0 1 0
0 0 0 0 0
>>> maze[0][0] = 3
>>> print_maze(maze)
3 0 0 0 0
0 1 0 1 0
3 0 0 0 0
0 1 0 1 0
3 0 0 0 0
Note that rows 0, 2, and 4 have all changed. This is because maze[0] is the same list object as maze[2] and maze[4] (they are all zero_row).
Instead, when you create the maze you want to append a copy of each row list. This can be done easily in Python using the following slicing notation:
for i in range(num_rows // 2):
    maze.append(zero_row[:])  # note the [:] syntax for copying a list
    maze.append(norm_row[:])
maze.append(zero_row[:])
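One way to check the difference between shared and copied rows is the `is` operator. The sketch below also shows a variant that builds a fresh list for every row; base_maze_fixed is a hypothetical name for this illustration, not the asker's function.

```python
zero_row = [0] * 5
aliased = [zero_row, zero_row]       # two references to one list object
copied = [zero_row[:], zero_row[:]]  # two independent copies

print(aliased[0] is aliased[1])  # True: same object
print(copied[0] is copied[1])    # False: separate objects

def base_maze_fixed(dimension):
    """Build every row as a new list so that no two rows share storage."""
    num_rows = 2 * dimension[1] + 1
    num_columns = 2 * dimension[0] + 1
    maze = []
    for r in range(num_rows):
        if r % 2 == 0:
            maze.append([0] * num_columns)                    # wall row
        else:
            maze.append([c % 2 for c in range(num_columns)])  # 0 1 0 1 ... 0
    return maze

maze = base_maze_fixed((3, 3))
maze[1][2] = 1   # changes row 1 only
print(maze[1])
print(maze[3])
```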

Simultaneous changing of python numpy array elements

I have a vector of integers from range [0,3], for example:
v = [0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1]
I know that I can replace a specific value in the vector with another value using the following:
v[v == 0] = 5
which changes all occurrences of 0 in vector v to the value 5.
But I would like to do something a little bit different - I want to change all values of 0 (let's call them target values) to 1, and all values different from 0 to 0, thus I want to obtain the following:
v = [1,1,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0]
However, I cannot call the substitution code (which I used above) as follows:
v[v==0] = 1
v[v!=0] = 0
because this obviously leads to a vector of zeros.
Is it possible to do the above substitution in a parallel way, to obtain the desired vector? (I want a universal technique that still works if I change my target value.) Any suggestions would be very helpful!
You can check if v is equal to zero and then convert the boolean array to int, and so if the original value is zero, the boolean is true and converts to 1, otherwise 0:
v = np.array([0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1])
(v == 0).astype(int)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
Or use numpy.where:
np.where(v == 0, 1, 0)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
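If the array really has to be changed in place, one option is to evaluate the condition once, before any assignment, and reuse the resulting mask. This is a small sketch; binarize_in_place is a made-up helper name and target is whatever value you want mapped to 1.

```python
import numpy as np

def binarize_in_place(v, target):
    """Set entries equal to target to 1 and all other entries to 0."""
    mask = (v == target)  # computed from the original values, before any write
    v[mask] = 1
    v[~mask] = 0
    return v

v = np.array([0, 0, 1, 2, 1, 3, 0, 3, 0, 2, 1, 1, 0, 2, 0, 3, 2, 1])
print(binarize_in_place(v.copy(), 0))
print(binarize_in_place(v.copy(), 3))
```

Because the mask is fixed before the first assignment, the second assignment cannot undo the first, which is what goes wrong with the two-step version in the question.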

Slicing different rows of a numpy array differently

I'm working on a Monte Carlo radiative transfer code, which simulates firing photons through a medium and statistically modelling their random walk. It runs slowly firing one photon at a time, so I'd like to vectorize it and run perhaps 1000 photons at once.
I have divided my slab through which the photons are passing into nlayers slices between optical depth 0 and depth. Effectively, that means that I have nlayers + 2 regions (nlayers plus the region above the slab and the region below the slab). At each step, I have to keep track of which layers each photon passes through.
Let's suppose that I already know that two photons start in layer 0. One takes a step and ends up in layer 2, and the other takes a step and ends up in layer 6. This is represented by an array pastpresent that looks like this:
[[ 0 2]
[ 0 6]]
I want to generate an array traveled_through with (nlayers + 2) columns and 2 rows, describing whether photon i passed through layer j (endpoint-inclusive). It would look something like this (with nlayers = 10):
[[ 1 1 1 0 0 0 0 0 0 0 0 0]
[ 1 1 1 1 1 1 1 0 0 0 0 0]]
I could do this by iterating over the photons and generating each row of traveled_through individually, but that's rather slow, and sort of defeats the point of running many photons at once, so I'd rather not do that.
I tried to define the array as follows:
traveled_through = np.zeros((2, nlayers)).astype(int)
traveled_through[ : , np.min(pastpresent, axis = 1) : np.max(pastpresent, axis = 1) + 1 ] = 1
The idea was that in a given photon's row, the indices from the starting layer through and including the ending layer would be set to 1, with all others remaining 0. However, I get the following error:
traveled_through[ : , np.min(pastpresent, axis = 1) : np.max(pastpresent, axis = 1) + 1 ] = 1
IndexError: invalid slice
My best guess is that numpy does not allow different rows of an array to be indexed differently using this method. Does anyone have suggestions for how to generate traveled_through for an arbitrary number of photons and an arbitrary number of layers?
If the two photons always start at 0, you could perhaps construct your array as follows.
First setting the variables...
>>> pastpresent = np.array([[0, 2], [0, 6]])
>>> nlayers = 10
...and then constructing the array:
>>> (pastpresent[:,1][:,np.newaxis] + 1 > np.arange(nlayers+2)).astype(int)
array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]])
Or if the photons have an arbitrary starting layer (using <= and >= so that both endpoints are included):
>>> pastpresent2 = np.array([[1, 7], [3, 9]])
>>> layers = np.arange(nlayers + 2)
>>> ((pastpresent2[:,0][:,np.newaxis] <= layers) &
...  (pastpresent2[:,1][:,np.newaxis] >= layers)).astype(int)
array([[0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]])
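The same broadcasting idea can be wrapped in a function. This sketch assumes both endpoints should be included and that a photon may also move towards lower layer numbers, so it takes the row-wise minimum and maximum first; the function name simply mirrors the array name from the question.

```python
import numpy as np

def traveled_through(pastpresent, nlayers):
    """Return an (nphotons, nlayers + 2) array of 0/1 flags, endpoints included."""
    pastpresent = np.asarray(pastpresent)
    lo = pastpresent.min(axis=1)[:, np.newaxis]  # column vector of first layers
    hi = pastpresent.max(axis=1)[:, np.newaxis]  # column vector of last layers
    layers = np.arange(nlayers + 2)              # row vector of layer indices
    return ((layers >= lo) & (layers <= hi)).astype(int)

print(traveled_through([[0, 2], [0, 6]], nlayers=10))
print(traveled_through([[6, 2]], nlayers=10))  # photon moving backwards
```

Comparing a column vector with a row vector broadcasts to one row per photon and one column per layer, so no Python-level loop over photons is needed.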
A little trick I kind of like for this kind of thing involves the accumulate method of the logical_xor ufunc:
>>> a = np.zeros(10, dtype=int)
>>> b = [3, 7]
>>> a[b] = 1
>>> a
array([0, 0, 0, 1, 0, 0, 0, 1, 0, 0])
>>> np.logical_xor.accumulate(a, out=a)
array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0])
Note that this sets to 1 the entries between the positions in b, first index inclusive, last index exclusive, so you have to handle off by 1 errors depending on what exactly you are after.
With several rows, you could make it work as:
>>> a = np.zeros((3, 10), dtype=int)
>>> b = np.array([[1, 7], [0, 4], [3, 8]])
>>> b[:, 1] += 1 # handle the off by 1 error
>>> a[np.arange(len(b))[:, None], b] = 1
>>> a
array([[0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
[1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0, 0, 1]])
>>> np.logical_xor.accumulate(a, axis=1, out=a)
array([[0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 1, 1, 1, 1, 1, 0]])

Python - creating a list with 2 characteristics bug

The goal is to create a list of 99 elements. All elements must be 1s or 0s. The first element must be a 1. There must be 7 1s in total.
import random
import math
import time
# constants determined through testing
generation_constant = 0.96
def generate_candidate():
    coin_vector = []
    coin_vector.append(1)
    for i in range(0, 99):
        random_value = random.random()
        if (random_value > generation_constant):
            coin_vector.append(1)
        else:
            coin_vector.append(0)
    return coin_vector
def validate_candidate(vector):
    vector_sum = sum(vector)
    sum_test = False
    if (vector_sum == 7):
        sum_test = True
    first_slot = vector[0]
    first_test = False
    if (first_slot == 1):
        first_test = True
    return (sum_test and first_test)
vector1 = generate_candidate()
while (validate_candidate(vector1) == False):
    vector1 = generate_candidate()
print(vector1, sum(vector1), validate_candidate(vector1))
Most of the time, the output is correct, saying something like
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0] 7 True
but sometimes, the output is:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 2 False
What exactly am I doing wrong?
I'm not certain I understand your requirements, but here's what it sounds like you need:
#!/usr/bin/python3
import random
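ones = [ 1 for i in range(6) ]        # the 6 ones that go in random positions
zeros = [ 0 for i in range(99 - 7) ]  # 92 zeros, so the final list has 99 elements
list_ = ones + zeros
random.shuffle(list_)
list_.insert(0, 1)                    # the first element must be a 1
print(list_)
print(list_.count(1))
print(list_.count(0))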
HTH
The algorithm you gave works, though it's slow. Note that the ideal generation_constant can actually be calculated using the binomial distribution. The optimum is ≈0.928571429, which will fit the conditions 1.104% of the time. If you set the first element to 1 manually, then the optimum generation_constant is ≈0.93877551, which will fit the conditions 16.58% of the time.
The above is based on the binomial distribution, which says that the probability of having exactly k "success" events out of N total tries, where each try has probability p, is P(k | N, p) = N! * p^k * (1 - p)^(N - k) / (k! * (N - k)!). Here p is the probability of drawing a 1, so generation_constant = 1 - p. Just stick that into Excel, Mathematica, or a graphing calculator and maximize P.
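The second figure can be checked in plain Python (math.comb needs Python 3.8+). The sketch assumes the first element is fixed to 1, so 98 positions are drawn at random and exactly 6 of them must come out as 1; binomial_pmf is a helper name chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_random, k_ones = 98, 6
best_p = k_ones / n_random  # the pmf at k is maximised at p = k / n
hit_rate = binomial_pmf(k_ones, n_random, best_p)

print(f"generation_constant = {1 - best_p:.8f}")
print(f"hit rate            = {hit_rate:.2%}")
print(f"hit rate with 0.96  = {binomial_pmf(k_ones, n_random, 1 - 0.96):.2%}")
```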
Alternatively:
To generate a list of 99 numbers where the first and 6 additional items are 1 and the remaining elements are 0, you don't need to call random.random so much. Generating pseudo-random numbers is comparatively expensive, and most candidates are thrown away anyway.
There are two ways to avoid calling random so much.
The most processor efficient way is to only call random 6 times, for the 6 ones you need to insert:
import random
# create vector of 99 0's
vector = [0 for i in range(99)]
# set first element to 1
vector[0] = 1
# list of locations of all 0's (a real list, so that pop() also works in Python 3)
indexes = list(range(1, 99))
# only need to loop 6 times for remaining 6 ones
for i in range(6):
    # select one of the 0 locations at random
    # "pop" it from the list so it can't be selected again
    # and set its corresponding element in vector to 1.
    vector[indexes.pop(random.randint(0, len(indexes) - 1))] = 1
Alternatively, to save on memory, you can just test each new index to make sure it will actually set something:
import random
# create vector of 99 0's
vector = [0 for i in range(99)]
# only need to loop 7 times
for i in range(7):
    index = 0                          # first element is set to 1 first
    while vector[index] == 1:          # keep calling random until a 0 is found
        index = random.randint(0, 98)  # random index to check/set
    vector[index] = 1                  # set the random (or first) element to 1
The second one will always set the first element to 1 first, because index = random.randint(0, 98) only ever gets called if vector[0] == 1.
With genetic programming you want to control your domain so that invalid configurations are eliminated as much as possible. The fitness is supposed to rate valid configurations, not eliminate invalid ones. Honestly, this problem doesn't really seem to be a good fit for genetic programming. You have outlined the domain, but I don't see a fitness description anywhere.
Anyway, that being said, the way I would populate the domain would be: since the first element is always 1, ignore it; since the remaining 98 elements contain only 6 ones, shuffle 6 ones in with 92 zeros. Or even enumerate the possible configurations, as your domain isn't very large.
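A minimal sketch of that shuffle idea follows; make_vector and its parameters are illustrative names, not taken from the question.

```python
import random

def make_vector(length=99, total_ones=7):
    """Fix the first element to 1 and shuffle the remaining ones among the zeros."""
    rest = [1] * (total_ones - 1) + [0] * (length - total_ones)  # 6 ones + 92 zeros
    random.shuffle(rest)
    return [1] + rest

vector = make_vector()
print(vector, sum(vector), len(vector))
```

Every call yields a valid candidate, so no validation loop is needed.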
I had a feeling it was your use of sum(), but the built-in sum() only reads the list and does not modify it in place:
>>> mylist = [1,2,3,4]
>>> sum(mylist)
10
>>> mylist
[1, 2, 3, 4]
Here's a (somewhat) pythonic recursive version
import random
def generate_vector():
    generation_constant = .96
    myvector = [1]+[ 1 if random.random() > generation_constant else 0 for i in range(0,99)]
    mysum = 0
    for a in myvector:
        mysum = (mysum + a)
    if mysum == 7 and myvector[0]==1:
        return myvector
    return generate_vector()  # try again until a valid vector comes up
and for good measure
def generate_test():
    for i in range(0,10000):
        vector = generate_vector()
        total = 0  # named total to avoid shadowing the built-in sum()
        for a in vector:
            total = total + a
        if total != 7 or vector[0]!=1:
            print(vector)
output:
>>> generate_test()
>>>
