Python - creating a list with 2 characteristics bug - python

The goal is to create a list of 99 elements. All elements must be 1s or 0s. The first element must be a 1. There must be 7 1s in total.
import random
import math
import time
# constants determined through testing
generation_constant = 0.96
def generate_candidate():
    coin_vector = []
    coin_vector.append(1)
    for i in range(0, 99):
        random_value = random.random()
        if (random_value > generation_constant):
            coin_vector.append(1)
        else:
            coin_vector.append(0)
    return coin_vector
def validate_candidate(vector):
    vector_sum = sum(vector)
    sum_test = False
    if (vector_sum == 7):
        sum_test = True
    first_slot = vector[0]
    first_test = False
    if (first_slot == 1):
        first_test = True
    return (sum_test and first_test)
vector1 = generate_candidate()
while (validate_candidate(vector1) == False):
    vector1 = generate_candidate()
print vector1, sum(vector1), validate_candidate(vector1)
Most of the time, the output is correct, saying something like
[1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0] 7 True
but sometimes, the output is:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 2 False
What exactly am I doing wrong?

I'm not certain I understand your requirements, but here's what it sounds like you need:
#!/usr/bin/python3
import random
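# 6 ones and 92 zeros fill positions 1..98; the leading 1 is inserted afterwards,
# which gives 99 elements in total
ones = [ 1 for i in range(6) ]
zeros = [ 0 for i in range(98 - 6) ]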
list_ = ones + zeros
random.shuffle(list_)
list_.insert(0, 1)
print(list_)
print(list_.count(1))
print(list_.count(0))
HTH

The algorithm you gave works, though it's slow. Note that the ideal generation_constant can actually be calculated using the binomial distribution. The optimum is ≈0.928571429 which will fit the conditions 1.104% of the time. If you set the first element to 1 manually, then the optimum generation_constant is ≈0.93877551 which will fit the conditions 16.58% of the time.
The above is based on the binomial distribution, which says that the probability of having exactly k "success" events out of N total tries, where each try has probability p, is P(k | N, p) = N! * p^k * (1 - p)^(N - k) / (k! * (N - k)!). Just stick that into Excel, Mathematica, or a graphing calculator and maximize P.
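As a rough check of the figures above, here is a minimal sketch that evaluates the binomial probability with math.comb (Python 3.8+). The function name binomial_pmf and the choice N = 98, k = 6 (first element fixed to 1, six more 1s among the remaining 98 slots) are my assumptions about the setup, not something stated in the question. The printed values should come out close to the ≈0.93877551 and 16.58% quoted above.

```python
import math

def binomial_pmf(k, n_trials, p):
    """Probability of exactly k successes in n_trials independent tries."""
    return math.comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)

# Assumed setup: first element fixed to 1, so 6 more 1s among 98 slots.
n_trials, k = 98, 6
p_best = k / n_trials               # the pmf is maximized at p = k / N
generation_constant = 1 - p_best    # the question appends a 1 when random() > constant

print(round(generation_constant, 8))
print(round(100 * binomial_pmf(k, n_trials, p_best), 2), "%")
```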
Alternatively:
To generate a list of 99 numbers where the first and 6 additional items are 1 and the remaining elements are 0, you don't need to call random.random so often. Every call costs time, and the rejection loop makes 99 of them per candidate.
There are two ways to avoid calling random so much.
The most processor efficient way is to only call random 6 times, for the 6 ones you need to insert:
import random
# create vector of 99 0's
vector = [0 for i in range(99)]
# set first element to 1
vector[0] = 1
# list of locations of all 0's (a real list, so that pop() also works in Python 3)
indexes = list(range(1, 99))
# only need to loop 6 times for remaining 6 ones
for i in range(6):
    # select one of the 0 locations at random,
    # "pop" it from the list so it can't be selected again,
    # and set its corresponding element in vector to 1.
    vector[indexes.pop(random.randint(0, len(indexes) - 1))] = 1
Alternatively, to save on memory, you can just test each new index to make sure it will actually set something:
import random
# create vector of 99 0's
vector = [0 for i in range(99)]
# only need to loop 7 times
for i in range(7):
    index = 0 # first element is set to 1 first
    while vector[index] == 1: # keep calling random until a 0 is found
        index = random.randint(0, 98) # random index to check/set
    vector[index] = 1 # set the random (or first) element to 1
The second one will always set the first element to 1 first, because index = random.randint(0, 98) only ever gets called if vector[0] == 1.
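Both loops above amount to picking 6 distinct positions out of indexes 1 to 98. As a minimal sketch, the standard library's random.sample does that selection without replacement in one call. The function name make_vector and its parameters are mine, chosen for illustration.

```python
import random

def make_vector(length=99, ones=7):
    """Return a 0/1 list of `length` items with vector[0] == 1 and `ones` 1s in total."""
    vector = [0] * length
    vector[0] = 1
    # Pick the remaining positions without replacement from indexes 1..length-1.
    for index in random.sample(range(1, length), ones - 1):
        vector[index] = 1
    return vector

vector = make_vector()
print(len(vector), sum(vector), vector[0])
```

Because the positions are sampled without replacement, no validation loop is needed afterwards.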

With genetic programming you want to control your domain so that invalid configurations are eliminated as much as possible. The fitness function is supposed to rate valid configurations, not eliminate invalid ones. Honestly, this problem doesn't really seem to be a good fit for genetic programming. You have outlined the domain, but I don't see a fitness description anywhere.
Anyway, that being said, the way I would populate the domain would be: since the first element is always 1, ignore it; since the remaining 98 contain only 6 ones, shuffle 6 ones into 92 zeros. Or even enumerate the possible configurations, as your domain isn't very large.
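The enumeration idea can be sketched with itertools.combinations, assuming the domain is "leading 1 plus 6 more 1s among the other 98 slots". The function name enumerate_domain is mine. Note that the domain has on the order of 10^9 members, so the sketch yields them lazily and only looks at the first few.

```python
from itertools import combinations, islice
from math import comb

def enumerate_domain(length=99, ones=7):
    """Lazily yield every 0/1 list with a leading 1 and `ones` 1s in total."""
    for positions in combinations(range(1, length), ones - 1):
        vector = [0] * length
        vector[0] = 1
        for index in positions:
            vector[index] = 1
        yield vector

print(comb(98, 6))  # size of the domain

first_three = list(islice(enumerate_domain(), 3))
for vector in first_three:
    # show the positions of the 1s instead of the full list
    print([i for i, bit in enumerate(vector) if bit])
```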

My first guess was your use of sum(), but the built-in sum() does not modify the list it is given, so that is not the cause:
>>> mylist = [1,2,3,4]
>>> sum(mylist)
10
>>> mylist
[1, 2, 3, 4]
Here's a (somewhat) pythonic recursive version
import random
def generate_vector():
    generation_constant = .96
    myvector = [1]+[ 1 if random.random() > generation_constant else 0 for i in range(0,99)]
    mysum = 0
    for a in myvector:
        mysum = (mysum + a)
    if mysum == 7 and myvector[0]==1:
        return myvector
    return generate_vector()
and for good measure
def generate_test():
    for i in range(0,10000):
        vector = generate_vector()
        sum = 0
        for a in vector:
            sum = sum + a
        if sum != 7 or vector[0]!=1:
            print(vector)
output:
>>> generate_test()
>>>

Related

How can I formulate my optimization problem with Gekko?

I want to formulate the following objective function (a minimization problem): sum[sum[Ri*{Pi² + (Qi - Qcj*Xij)²} for j in range(Nc)] for i in range(N)], where P, Q and R are constants, Qc is a list of proposed solutions, and X is the decision variable (a binary variable). I'm trying to find the X that minimizes the objective function.
Here is my attempt:
from gekko import GEKKO
import numpy as np
P=[13.10511598922975,11.2611396806742,10.103920431906348,8.199519500182628,6.411296067052755,4.753519719147589,3.8977762462825973,2.6593092284662734,1.6399999999854893]
Q=[5.06643685386732,4.4344047044589585,3.8082608015186405,3.2626022579039584,1.2568869621197523,0.6152693459109657,0.46237064874523776,0.35226399840832523,0.20000000001140983]
R=[0.1233, 0.014, 0.7463, 0.6984, 1.9831, 0.9053, 2.0552, 4.7953, 5.3434]
Qc=[150, 300, 450, 600,750, 900,1050, 1200,1350,1500,1650,1800,1950,2100,2250,2400,2550,2700,2850,3000,3150,3300,3450,3600,3750,3900,4050]
N=len(Q)
Nc=len(Qc)
m = GEKKO(remote=False)
X = m.Array(m.Var,(N,Nc),integer=True,lb=0,ub=1,value=0)
# convert P and Q to kW
for i in range(N):
    Q[i]=Q[i]*1000
    P[i]=P[i]*1000
# constraint: at most one per row
for i in range(N):
    m.Equation(m.sum([X[i][j]for j in range(Nc)])<=1)
b=m.sum([m.sum([R[i]*((P[i]**2)+((Q[i])-Qc[j]*X[i][j])**2) for j in range(Nc)]) for i in range(N)])
m.Minimize(b)
I tried 3 methods:
method 1:
m.options.SOLVER = 1
m.solve()
method 2:
bv = np.array([[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 1, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 1, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 1, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]])
for i in range(N):
    for j in range(Nc):
        X[i,j].value = bv[i,j]
m.options.SOLVER = 1
m.solve()
method 3:
m.options.SOLVER = 3
m.solve(debug=0, disp=True)
m.options.SOLVER = 1
m.solve(debug=0, disp=True)
The 3 methods don't give me the optimal solution.
Use the solver options to get a better solution by not terminating when the gap_tol is met at 1e-3 (default). The gap_tol is an early termination criterion that helps obtain MINLP solutions faster, but with a less optimal solution. Setting gap_tol to zero and minlp_max_iter_with_int_sol to a large number will iterate through all remaining potential solutions. The computational time increases so I recommend a smaller gap_tol such as 1e-5 and minlp_max_iter_with_int_sol 2000.
m.solver_options = ['minlp_gap_tol 1e-5',\
'minlp_maximum_iterations 10000',\
'minlp_max_iter_with_int_sol 2000']
m.options.SOLVER=1
m.solve(disp=True)
This gives a solution with an objective of 9.535e9.
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 170.673799999990 sec
Objective : 9535331689.96189
Successful solution
---------------------------------------------------
The objective with the initial guess fixed is 9.541e9.
bv = np.array([[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 1, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 0, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 1, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
[0, 0, 0, 0, 1, 0, 0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]])
for i in range(N):
    for j in range(Nc):
        X[i,j].value = bv[i,j]
        m.Equation(X[i,j]==bv[i,j])
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 2.180000001681037E-002 sec
Objective : 9540896947.56266
Successful solution
---------------------------------------------------
A couple other suggestions that didn't help the solution accuracy, but may be things to consider in the future:
The speed of solution can be improved with this alternative for expressing the objective function.
b=[sum([R[i]*((P[i]**2)+((Q[i])-Qc[j]*X[i][j])**2)
for j in range(Nc)]) for i in range(N)]
[m.Minimize(bi) for bi in b]
The objective function is quite high >1e9 at the solution. You could consider increasing the solver tolerance and keeping the problem in MW versus kW by removing these lines.
# convert P and Q to kW
#for i in range(N):
#    Q[i]=Q[i]*1000
#    P[i]=P[i]*1000
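Assuming the objective and the constraint are exactly as written in the question (an "at most one per row" constraint and no coupling between rows), the problem separates by row, so a plain-Python enumeration can serve as a baseline to compare a solver result against. This is only a sketch under that assumption; the helper row_cost and the variable names are mine and not part of Gekko.

```python
# Data copied from the question; P and Q are multiplied by 1000 as in the question.
P = [13.10511598922975, 11.2611396806742, 10.103920431906348,
     8.199519500182628, 6.411296067052755, 4.753519719147589,
     3.8977762462825973, 2.6593092284662734, 1.6399999999854893]
Q = [5.06643685386732, 4.4344047044589585, 3.8082608015186405,
     3.2626022579039584, 1.2568869621197523, 0.6152693459109657,
     0.46237064874523776, 0.35226399840832523, 0.20000000001140983]
R = [0.1233, 0.014, 0.7463, 0.6984, 1.9831, 0.9053, 2.0552, 4.7953, 5.3434]
Qc = list(range(150, 4051, 150))  # 150, 300, ..., 4050 (27 values)

P_kw = [1000 * p for p in P]
Q_kw = [1000 * q for q in Q]

def row_cost(i, choice):
    """Objective contribution of row i when column `choice` is 1 (None = all zeros)."""
    total = 0.0
    for j in range(len(Qc)):
        x = 1 if j == choice else 0
        total += R[i] * (P_kw[i] ** 2 + (Q_kw[i] - Qc[j] * x) ** 2)
    return total

best_choice = []
best_objective = 0.0
for i in range(len(Q)):
    # Each row has len(Qc) + 1 feasible settings: one column set, or none.
    options = [None] + list(range(len(Qc)))
    choice = min(options, key=lambda c: row_cost(i, c))
    best_choice.append(choice)
    best_objective += row_cost(i, choice)

print(best_objective)
print([None if c is None else Qc[c] for c in best_choice])
```

If the model later gains constraints that tie rows together, this shortcut no longer applies and a MINLP solver is needed again.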

How to create a stream generator for a given iterable object?

Write a function that produces a stream generator for a given iterable object (list, generator, etc.) whose elements contain a position and a value and are sorted by order of appearance. The stream generator should be equal to the initial stream (without positions), but with the gaps filled with zeroes. For example:
gen = gen_stream(9,[(4,111),(7,12)])
list(gen)
[0, 0, 0, 0, 111, 0, 0, 12, 0] # the first element has index zero, so 111 is in the fifth position and 12 in the eighth
I.e. the 2 significant elements have indexes 4 and 7; all other elements are filled with zeroes.
To simplify things, the elements of the initial stream are sorted (i.e. an element with a lower position precedes an element with a higher position).
The first parameter can be None; in this case the stream should be infinite, e.g. an infinite stream of zeroes:
gen_stream(None, [])
The following stream starts with 0, 0, 0, 0, 111, 0, 0, 12, ... and then generates zeroes infinitely:
gen_stream(None, [(4,111),(7,12)])
The function should also support a custom position-value extractor for more advanced cases, e.g.
def day_extractor(x):
    # days per month, January to December (non-leap year)
    months = [31,28,31,30,31,30,31,31,30,31,30,31]
    acc = sum(months[:x[1]-1]) + x[0] - 1
    return (acc, x[2])
precipitation_days = [(3,1,4),(5,2,6)]
list(gen_stream(59,precipitation_days,day_extractor)) #59: January and February to limit output
[0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
The precipitation_days format is the following: (d, m, mm), where d is the day of the month, m is the month, and mm is the precipitation in millimeters.
So, in the example:
(3,1,4) - January 3, precipitation: 4 mm
(5,2,6) - February 5, precipitation: 6 mm
The extractor is passed as an optional third parameter whose default value is a lambda function that handles (position, value) pairs as in the first example.
This is what I did:
import sys
a=[(4,111),(7,12)]
n = 9
def gen_stream(n1, a1):
    if n1==None:
        b = [0 for i in range(sys.maxsize)]
    else:
        b = [0 for i in range(n1)]
    for i in range(len(a1)):
        b[a[i][0]]=a[i][1]
    for i in range(len(b)):
        yield b[i]
for i in gen_stream(None, a):
    print(i)
So far I have tried to get a stream of infinite zeros, but the function does not run for some reason: I get a MemoryError and the program eats a lot of RAM. And how do I handle the months case next? Please help.
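One possible way to meet the specification above without building the whole list in memory is to walk the positions lazily and emit either the next pending value or a zero. This is a minimal sketch, not a reference solution: it assumes the input really is sorted by position, and it uses itertools.count for the infinite case. The day_extractor is the one from the task statement (with standard month lengths).

```python
from itertools import count, islice

def gen_stream(total, sorted_iterable, extractor=lambda x: x):
    """Yield `total` values (or infinitely many if total is None), zero-filling gaps."""
    items = map(extractor, sorted_iterable)   # lazily yields (position, value) pairs
    pending = next(items, None)               # next significant element, if any
    positions = count() if total is None else range(total)
    for pos in positions:
        if pending is not None and pending[0] == pos:
            yield pending[1]
            pending = next(items, None)
        else:
            yield 0

def day_extractor(x):
    # days per month, January to December (non-leap year)
    months = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    acc = sum(months[:x[1] - 1]) + x[0] - 1
    return (acc, x[2])

print(list(gen_stream(9, [(4, 111), (7, 12)])))
print(list(islice(gen_stream(None, [(4, 111), (7, 12)]), 12)))
print(list(gen_stream(59, [(3, 1, 4), (5, 2, 6)], day_extractor)))
```

Because nothing is materialized up front, the None case never allocates a list of sys.maxsize elements; islice is only used here to cut the infinite stream for printing.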

Itertools with conditions in python 3

I am trying to generate vectors of numbers from [0 ... k-1] with length k^n, where n and k are given beforehand.
from itertools import product
k = 4
n = 2
args = list(product(range(k), repeat=n))
# vector=str([i for i in range(k)]*(n+1))
for i in product(range(k), repeat=k ** n):
    if (check(i, args)): print(i)
The commented line is not important; it was just an idea of mine.
I need to generate these vectors with a condition: each number from [0; k-1] must appear in a vector at least n times. So this is a task about permutations with repetition under a special condition on how often each number appears. What should I do?
For example, with k=2, n=2 the vector has 4 elements and I want to see 0 and 1 two or more times each.
I should get 0011 0101 0110 1001 1010 1100
The example is easy, but with k=5, n=2 (for example) the vector has 25 elements: I want to see each of 0 1 2 3 4 at least 2 times, and the other 15 elements can be any of 0 1 2 3 4, so it becomes difficult.
UPDATE:
Here is a solution that generates the necessary combinations only. It is in principle faster, although the complexity is still exponential and you can quickly hit the limits of recursion.
def my_vectors(k, n):
    # Minimum repetitions per element
    base_repetitions = [n] * k
    # "Unassigned" repetitions
    rest = k ** n - k * n
    # List reused for permutation construction
    permutation = [-1] * (k ** n)
    # For each possible repetition assignment
    for repetitions in make_repetitions(base_repetitions, rest):
        # Make all possible permutations
        yield from make_permutations(repetitions, permutation)
# Finds all possible repetition assignments
def make_repetitions(repetitions, rest, first=0):
    if rest <= 0:
        yield repetitions
    else:
        for i in range(first, len(repetitions)):
            repetitions[i] += 1
            yield from make_repetitions(repetitions, rest - 1, i)
            repetitions[i] -= 1
# Make all permutations with repetitions
def make_permutations(repetitions, permutation, idx=0):
    if idx >= len(permutation):
        yield list(permutation)
        # If you are going to use the permutation within a loop only
        # maybe you can avoid copying the list and do just:
        # yield permutation
    else:
        for elem in range(len(repetitions)):
            if repetitions[elem] > 0:
                repetitions[elem] -= 1
                permutation[idx] = elem
                yield from make_permutations(repetitions, permutation, idx + 1)
                repetitions[elem] += 1
for v in my_vectors(3, 2):
    print(v)
Output:
[0, 0, 0, 0, 0, 1, 1, 2, 2]
[0, 0, 0, 0, 0, 1, 2, 1, 2]
[0, 0, 0, 0, 0, 1, 2, 2, 1]
[0, 0, 0, 0, 0, 2, 1, 1, 2]
[0, 0, 0, 0, 0, 2, 1, 2, 1]
[0, 0, 0, 0, 0, 2, 2, 1, 1]
[0, 0, 0, 0, 1, 0, 1, 2, 2]
[0, 0, 0, 0, 1, 0, 2, 1, 2]
[0, 0, 0, 0, 1, 0, 2, 2, 1]
[0, 0, 0, 0, 1, 1, 0, 2, 2]
...
This is an inefficient but simple way to implement it:
from itertools import product
from collections import Counter
def my_vectors(k, n):
    for v in product(range(k), repeat=k ** n):
        count = Counter(v)
        if all(count[i] >= n for i in range(k)):
            yield v
for v in my_vectors(3, 2):
    print(v)
Output:
(0, 0, 0, 0, 0, 1, 1, 2, 2)
(0, 0, 0, 0, 0, 1, 2, 1, 2)
(0, 0, 0, 0, 0, 1, 2, 2, 1)
(0, 0, 0, 0, 0, 2, 1, 1, 2)
(0, 0, 0, 0, 0, 2, 1, 2, 1)
(0, 0, 0, 0, 0, 2, 2, 1, 1)
(0, 0, 0, 0, 1, 0, 1, 2, 2)
(0, 0, 0, 0, 1, 0, 2, 1, 2)
(0, 0, 0, 0, 1, 0, 2, 2, 1)
(0, 0, 0, 0, 1, 1, 0, 2, 2)
...
Obviously, as soon as your numbers get slightly bigger it will take forever to run, so it is only useful either for very small problems or as a baseline for comparison.
In any case, the number of items that the problem produces is exponentially large anyway, so although you can make it significantly better (i.e. generate only the right elements instead of all the possible ones and discarding), it cannot be "fast" for any size.
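To get a feel for how quickly the output grows, the number of valid vectors can be counted without generating them, by summing multinomial coefficients over every way of assigning a count of at least n to each value. This is a sketch for illustration; the function name count_vectors is mine.

```python
from math import factorial

def count_vectors(k, n):
    """Count length k**n vectors over range(k) in which every value appears >= n times."""
    length = k ** n

    def assignments(values_left, slots_left):
        # Yield tuples of per-value counts, each >= n, that sum to slots_left.
        if values_left == 1:
            if slots_left >= n:
                yield (slots_left,)
            return
        for c in range(n, slots_left - n * (values_left - 1) + 1):
            for tail in assignments(values_left - 1, slots_left - c):
                yield (c,) + tail

    total = 0
    for counts in assignments(k, length):
        ways = factorial(length)        # multinomial: length! / (c1! * c2! * ...)
        for c in counts:
            ways //= factorial(c)
        total += ways
    return total

print(count_vectors(2, 2))  # 6, matching the six vectors listed in the question
print(count_vectors(3, 2))  # 11508
print(count_vectors(5, 2))
```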

Find and flat repeated values in numpy array

I want to find values in an np array that are repeated more than x times and set them to 0.
Let's say this is my array:
[255,0,0,255,255,255,0,0,255,255,255,255,255,0,0]
I want to set to 0 every run of a value that is repeated more than x times in a row.
Let's say x = 3; the output array will be:
[255,0,0,255,255,255,0,0,0,0,0,0,0,0,0]
If x = 2:
[255,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Of course, I can loop over the indexes, count them and set to 0, but there's got to be a faster and more efficient way (the purpose is to remove horizontal grids from an image).
Using pandas
s = pd.Series(x)
n = 5
s.groupby((s != s.shift()).cumsum()).apply(lambda z: z if z.size < n else pd.Series([0]*z.size)).values
array([255, 0, 0, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int64)
n = 2
array([255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int64)
You may be able to solve this by viewing your data through a rolling window of length x+1 and hop size 1. If all values in a window are equal, set them all to zero. Rolling windows can easily be created with scikit-image's view_as_windows():
import numpy
import skimage
x = 3
data = numpy.asarray([255,0,0,255,255,255,0,0,255,255,255,255,255,0,0])
data_view = skimage.util.view_as_windows(data, window_shape=(x + 1,))
mask = numpy.all(numpy.isclose(data_view, data_view[..., 0, None]), axis=1)
data_view[mask, :] = 0
data
# array([255, 0, 0, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0])
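A NumPy-only alternative is to compute run lengths directly. This is a minimal sketch, assuming that "repeated more than x times" means a run of more than x equal consecutive values; the function name zero_long_runs is mine.

```python
import numpy as np

def zero_long_runs(values, x):
    """Return a copy of `values` with every run longer than x set to 0."""
    arr = np.asarray(values).copy()
    if arr.size == 0:
        return arr
    # Index of the first element of each run of equal consecutive values.
    starts = np.flatnonzero(np.r_[True, arr[1:] != arr[:-1]])
    # Length of each run (the last run ends at the end of the array).
    lengths = np.diff(np.r_[starts, arr.size])
    # Expand the per-run decision back to one boolean per element.
    mask = np.repeat(lengths > x, lengths)
    arr[mask] = 0
    return arr

data = [255, 0, 0, 255, 255, 255, 0, 0, 255, 255, 255, 255, 255, 0, 0]
print(zero_long_runs(data, 3))
print(zero_long_runs(data, 2))
```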

Simultaneous changing of python numpy array elements

I have a vector of integers from range [0,3], for example:
v = [0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1].
I know that I can replace specific values in the vector with another value using the following
v[v == 0] = 5
which changes all appearances of 0 in vector v to the value 5.
But I would like to do something a little bit different: I want to change all values of 0 (let's call them target values) to 1, and all values different from 0 to 0, so that I obtain the following:
v = [1,1,0,0,0,0,1,0,1,0,0,0,1,0,1,0,0,0]
However, I cannot call the substitution code (which I used above) as follows:
v[v==0] = 1
v[v!=0] = 0
because this obviously leads to a vector of zeros.
Is it possible to do the above substitution in a parallel way to obtain the desired vector? (I want a universal technique that still works if I change the target value.) Any suggestions will be very helpful!
You can check if v is equal to zero and then convert the boolean array to int, and so if the original value is zero, the boolean is true and converts to 1, otherwise 0:
v = np.array([0,0,1,2,1,3, 0,3,0,2,1,1,0,2,0,3,2,1])
(v == 0).astype(int)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
Or use numpy.where:
np.where(v == 0, 1, 0)
# array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0])
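For the "universal technique" asked about, the same comparison can be wrapped so that the target value is a parameter. A minimal sketch, where the function name indicator is mine:

```python
import numpy as np

def indicator(values, target):
    """Return an int array that is 1 where values == target and 0 elsewhere."""
    values = np.asarray(values)
    # The comparison is evaluated on the original values in one step,
    # so there is no intermediate state that a second assignment could overwrite.
    return (values == target).astype(int)

v = np.array([0, 0, 1, 2, 1, 3, 0, 3, 0, 2, 1, 1, 0, 2, 0, 3, 2, 1])
print(indicator(v, 0))
print(indicator(v, 3))
```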
