Input code is:
# Input data:
S = pd.S = 2000 # Saturation flow
L = pd.L = 5 # Lost time
eb = pd.eb = 1000
wb = pd.wb = 600
sb = pd.sb = 400
nb = pd.nb = 500
# a) C_min = Minimum cycle length calculation
Y_eb = pd.Y_eb = eb / S
Y_wb = pd.Y_wb = wb / S
Y_sb = pd.Y_sb = sb / S
Y_nb = pd.Y_nb = nb / S
Y_eb_wb_sb_nb = [Y_eb,Y_wb,Y_sb,Y_nb]
Y_eb_wb_sb_nb
Output:
[0.5, 0.3, 0.2, 0.25]
Then
if Y_eb > Y_wb:
print(C_min = L / 1 - (Y_eb + Y_wb))
I want to:
Get maximum values from (Y_eb;Y_wb) and (Y_sb;Y_nb) and apply these values to formula:
C_min = L / (1- [max of (Y_eb;Y_wb)] + [max of (Y_sb;Y_nb)])
Use the built-in max function:
C_min = L / (1- max(Y_eb,Y_wb) + max(Y_sb,Y_nb))
Python has a built-in max function that returns the largest item of an iterable, or the largest of two or more arguments:
max(iterable, *[, key, default])
max(arg1, arg2, *args[, key])
"Return the largest item in an iterable or the largest of two or more
arguments"
https://docs.python.org/3/library/functions.html#max
Answer:
C_min = L / (1- max([Y_eb, Y_wb]) + max([Y_sb, Y_nb]))
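A minimal sketch of the calculation with the numbers from the question. The first result follows the formula exactly as it is written above (1 - max + max). The second assumes the intent was to subtract the sum of both maxima, which needs an extra pair of parentheses; that reading is an assumption, not something the question confirms.

```python
S = 2000  # saturation flow
L = 5     # lost time
Y_eb, Y_wb, Y_sb, Y_nb = 1000 / S, 600 / S, 400 / S, 500 / S

# Largest flow ratio in each pair of approaches
crit_ew = max(Y_eb, Y_wb)
crit_ns = max(Y_sb, Y_nb)

# Formula exactly as written in the question: 1 - a + b
C_min_as_written = L / (1 - crit_ew + crit_ns)

# Assumed alternative: subtract the sum of both maxima, 1 - (a + b)
C_min_sum = L / (1 - (crit_ew + crit_ns))

print(C_min_as_written, C_min_sum)
```

Because subtraction and addition are evaluated left to right, the two expressions give different results, so it is worth checking which one matches the intended formula.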
I'm building a Genetic Algorithm to maximize this function: x^5 - 10x^3 + 30x - y^2 + 21y.
The code must be in binary and the bounds for x and y are [-2.5, 2.5]. To generate the initial population I made a 16 bit string for both x and y where:
The first bit represents the sign [0 or 1]
The second and third bits represent the integer part [00, 01 or 10]
The remaining bits represent the fractional part
This is the function that generates the initial population:
from random import randint

def generate_population(n_pop):
    population = list()
    for _ in range(n_pop):
        aux = list()
        for _ in range(2):
            signal = bin(randint(0, 1))[2:]
            int_part = bin(randint(0, 2))[2:].zfill(2)
            float_part = bin(randint(0, 5000))[2:].zfill(13)
            aux.append((signal+int_part+float_part))
        population.append(aux)
    return population
I also made a function that returns the binary number into float:
def convert_float(individual):
    float_num = list()
    for i in range(2):
        signal = int(individual[i][0])
        int_part = int(individual[i][1:3], 2)
        float_part = int(individual[i][3:], 2) * (10 ** -4)
        value = round(int_part + float_part, 4)
        if value > 2.5:
            value = 2.5
        if signal == 1:
            value = value * (-1)
        float_num.append(value)
    return float_num
And lastly this function that calculate the fitness of each individual:
def get_fitness(individual):
    x = individual[0]
    y = individual[1]
    return x ** 5 - 10 * x ** 3 + 30 * x - y ** 2 + 21 * y
This is my main function:
def ga(n_pop=10, n_iter=10):
    population = generate_population(n_pop)
    best_fitness_id, best_fitness = 0, get_fitness(convert_float(population[0]))
    for i in range(n_iter):
        float_population = [convert_float(x) for x in population]
        fitness_population = [get_fitness(x) for x in float_population]
        for j in range(n_pop):
            if fitness_population[j] > best_fitness:
                best_fitness_id, best_fitness = j, fitness_population[j]
                print(f'--> NEW BEST FOUND AT GENERATION {i}:')
                print(f'{float_population[j]} = {fitness_population[j]}')
        selected_parents = rank_selection()
        # childrens = list()
        # childrens = childrens + population[best_fitness_id] # ELITE
After running the program I have something like this:
The population looks like: [['0000001100110111', '0000110111110101'], ['0010011111101110', '1000100101001001'], ...
The float population: [[0.0823, 0.3573], [1.203, -0.2377], ...
And the fitness values: [9.839066068044746, 16.15145434928624, ...
I need help building the rank_selection() function; I've been stuck on this selection step for 2 days. I know it involves something like 1/N, 2/N, etc., and I've seen tons of examples in multiple languages, but I could not apply any of them to this particular algorithm, and it MUST be rank selection.
I already know how to perform crossover and mutation.
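One possible way to sketch rank_selection() is linear ranking: sort the individuals by fitness, give the worst rank 1 and the best rank N, and draw parents with probability rank / (1 + 2 + ... + N). The signature below (population, fitness_population, n_parents, rng) is an assumption made for illustration; the question calls rank_selection() without arguments, so the call site would need to pass these in.

```python
import random

def rank_selection(population, fitness_population, n_parents=None, rng=random):
    """Linear rank selection (a sketch, not the only possible scheme).

    Selection probability depends on the rank of an individual,
    not on the raw fitness value, so negative fitness is not a problem.
    """
    n = len(population)
    if n_parents is None:
        n_parents = n
    # Indices ordered from worst fitness (rank 1) to best fitness (rank n)
    order = sorted(range(n), key=lambda i: fitness_population[i])
    total = n * (n + 1) // 2                      # 1 + 2 + ... + n
    weights = [rank / total for rank in range(1, n + 1)]
    # Draw indices with replacement according to the rank weights
    chosen = rng.choices(order, weights=weights, k=n_parents)
    return [population[i] for i in chosen]

# Hypothetical usage with placeholder individuals
pop = [['a'], ['b'], ['c'], ['d']]
fit = [9.8, 16.1, -3.0, 2.5]
parents = rank_selection(pop, fit, n_parents=1000, rng=random.Random(0))
```

Inside ga(), the call would then look like selected_parents = rank_selection(population, fitness_population). Sampling is done with replacement, so a highly ranked individual can appear several times among the parents.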
Here is a simple example, which numerically integrates the product of two Gaussian pdfs. One of the Gaussians is fixed, with mean always at 0. The other Gaussian varies in its mean:
import time
import jax.numpy as np
from jax import jit
from jax.scipy.stats.norm import pdf
# set up evaluation points for numerical integration
integr_resolution = 6400
lower_bound = -100
upper_bound = 100
integr_grid = np.linspace(lower_bound, upper_bound, integr_resolution)
proba = pdf(integr_grid)
integration_weight = (upper_bound - lower_bound) / integr_resolution
# integrate with new mean
def integrate(mu_new):
    x_new = integr_grid - mu_new
    proba_new = pdf(x_new)
    total_proba = sum(proba * proba_new * integration_weight)
    return total_proba
print('starting jit')
start = time.perf_counter()
integrate = jit(integrate)
integrate(1)
stop = time.perf_counter()
print('took: ', stop - start)
The function looks simple, but it scales very poorly. The following list contains pairs of (value for integr_resolution, time it took to run the code):
100 | 0.107s
200 | 0.23s
400 | 0.537s
800 | 1.52s
1600 | 5.2s
3200 | 19s
6400 | 134s
For reference, the unjitted function, applied to integr_resolution=6400 takes 0.02s.
I thought that this might be related to the fact that the function is accessing a global variable. But moving the code to set up the integration points inside of the function has no notable influence on the timing. The following code takes 5.36s to run. It corresponds to the table entry with 1600 which previously took 5.2s:
# integrate with new mean
def integrate(mu_new):
    # set up evaluation points for numerical integration
    integr_resolution = 1600
    lower_bound = -100
    upper_bound = 100
    integr_grid = np.linspace(lower_bound, upper_bound, integr_resolution)
    proba = pdf(integr_grid)
    integration_weight = (upper_bound - lower_bound) / integr_resolution
    x_new = integr_grid - mu_new
    proba_new = pdf(x_new)
    total_proba = sum(proba * proba_new * integration_weight)
    return total_proba
What is happening here?
I also answered this at https://github.com/google/jax/issues/1776, but adding the answer here too.
It's because the code uses sum where it should use np.sum.
sum is a Python built-in that extracts each element of a sequence and sums them one by one using the + operator. This has the effect of building a large, unrolled chain of adds which XLA takes a long time to compile.
If you use np.sum, then JAX builds a single XLA reduction operator, which is much faster to compile.
And just to show how I figured this out: I used jax.make_jaxpr, which dumps JAX's internal trace representation of a function. Here, it shows:
In [3]: import jax
In [4]: jax.make_jaxpr(integrate)(1)
Out[4]:
{ lambda b c ; ; a.
let d = convert_element_type[ new_dtype=float32
old_dtype=int32 ] a
e = sub c d
f = sub e 0.0
g = pow f 2.0
h = div g 1.0
i = add 1.8378770351409912 h
j = neg i
k = div j 2.0
l = exp k
m = mul b l
n = mul m 2.0
o = slice[ start_indices=(0,)
limit_indices=(1,)
strides=(1,)
operand_shape=(100,) ] n
p = reshape[ new_sizes=()
dimensions=None
old_sizes=(1,) ] o
q = add p 0.0
r = slice[ start_indices=(1,)
limit_indices=(2,)
strides=(1,)
operand_shape=(100,) ] n
s = reshape[ new_sizes=()
dimensions=None
old_sizes=(1,) ] r
t = add q s
u = slice[ start_indices=(2,)
limit_indices=(3,)
strides=(1,)
operand_shape=(100,) ] n
v = reshape[ new_sizes=()
dimensions=None
old_sizes=(1,) ] u
w = add t v
x = slice[ start_indices=(3,)
limit_indices=(4,)
strides=(1,)
operand_shape=(100,) ] n
y = reshape[ new_sizes=()
dimensions=None
old_sizes=(1,) ] x
z = add w y
... similarly ...
and it's then obvious why this is slow: the program is very big.
Contrast the np.sum version:
In [5]: def integrate(mu_new):
   ...:     x_new = integr_grid - mu_new
   ...:
   ...:     proba_new = pdf(x_new)
   ...:     total_proba = np.sum(proba * proba_new * integration_weight)
   ...:
   ...:     return total_proba
   ...:
In [6]: jax.make_jaxpr(integrate)(1)
Out[6]:
{ lambda b c ; ; a.
let d = convert_element_type[ new_dtype=float32
old_dtype=int32 ] a
e = sub c d
f = sub e 0.0
g = pow f 2.0
h = div g 1.0
i = add 1.8378770351409912 h
j = neg i
k = div j 2.0
l = exp k
m = mul b l
n = mul m 2.0
o = reduce_sum[ axes=(0,)
input_shape=(100,) ] n
in [o] }
Hope that helps!
I want to detect and store outliers from a list and this is what I am doing
Code:
import numpy as np

def outliers(y,thresh=3.5):
    m = np.median(y)
    abs_dev = np.abs(y - m)
    left_mad = np.median(abs_dev[y <= m])
    right_mad = np.median(abs_dev[y >= m])
    y_mad = left_mad * np.ones(len(y))
    y_mad[y > m] = right_mad
    modified_z_score = 0.6745 * abs_dev / y_mad
    modified_z_score[y == m] = 0
    return modified_z_score > thresh
bids = [5000,5500,4500,1000,15000,5200,4900]
z = outliers(bids)
bidd = np.array(bids)
out_liers = bidd[z]
This gives results as:
out_liers = array([ 1000, 15000])
Is there a better way to do this, where I get the results in a list rather than an array?
Also, could someone please explain why we use
thresh=3.5
modified_z_score = 0.6745 * abs_dev / y_mad
This works:
def outliers_modified_z_score(ys, threshold=3.5):
    ys_arr = np.array(ys)
    median_y = np.median(ys_arr)
    median_absolute_deviation_y = np.median(np.abs(ys_arr - median_y))
    modified_z_scores = 0.6745 * (ys_arr - median_y) / median_absolute_deviation_y
    return (ys_arr[np.abs(modified_z_scores) > threshold]).tolist()
That's because you are using NumPy functions. They return numpy.ndarray by default, which speeds up the computations. If you just need a list as the output, use the tolist() method.
z = outliers(bids)
bidd = np.array(bids)
out_liers = bidd[z].tolist()
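On the two constants: for normally distributed data the median absolute deviation (MAD) is about 0.6745 times the standard deviation, because 0.6745 is approximately the 75th percentile of the standard normal distribution. Multiplying by 0.6745 therefore puts the MAD-based score on roughly the same scale as an ordinary z-score. The cutoff of 3.5 is the conventional threshold usually attributed to Iglewicz and Hoaglin for this modified z-score; it is a rule of thumb, not a derived value. The sketch below uses only the standard library and a single MAD (as in the answer above, not the left/right variant from the question); the function name modified_z_scores is illustrative.

```python
from statistics import NormalDist, median

# 75th percentile of the standard normal distribution, about 0.6745
k = NormalDist().inv_cdf(0.75)

def modified_z_scores(values):
    """Return k * (x - median) / MAD for each value (single-MAD variant)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [k * (v - med) / mad for v in values]

bids = [5000, 5500, 4500, 1000, 15000, 5200, 4900]
scores = modified_z_scores(bids)

# Keep the values whose absolute score exceeds the conventional cutoff
out_liers = [b for b, s in zip(bids, scores) if abs(s) > 3.5]
print(k, out_liers)
```

This version returns a plain list directly, at the cost of NumPy's speed on large inputs.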
I have a pandas dataframe with 6 million rows. The columns are:
['x', 'y']
I need to apply a simple calculation between x and y, and append the result to the dataframe.
This is what I've tried:
import math
import numpy as np

'''
Calculates the height of a pressure level in feet
'''
def pressure_to_elevation(P, T = None):
    sea_level_pressure = 1013.25
    if T is not None:
        # https://www.omnicalculator.com/physics/air-pressure-at-altitude
        P0 = sea_level_pressure
        g = 9.80665
        M = 0.0289644
        R0 = 8.31447
        m = (np.log(P/P0)*T) / -(g*M/R0)
        f = 3.28084 * m
        return f
    b = 0.190284
    c = 145366.45
    return (1-math.pow((P/sea_level_pressure), b)) * c
test_df['result'] = test_df.apply(lambda row: pressure_to_elevation(row['x'], row['y']),axis=1)
Unfortunately, this takes a ridiculous amount of time... in fact, I've yet to see it complete.
Is there a faster way to do this?
Try this:
def pressure_to_elevation(P, T):
    sea_level_pressure = 1013.25
    P0 = sea_level_pressure
    g = 9.80665
    M = 0.0289644
    R0 = 8.31447
    b = 0.190284
    c = 145366.45
    # np.power works on all NumPy versions; np.pow only exists in NumPy 2.x
    return np.where(T.notnull(),
                    3.28084 * ((np.log(P/P0)*T) / -(g*M/R0)),
                    (1-np.power((P/sea_level_pressure), b)) * c)
Usage:
test_df['result'] = pressure_to_elevation(test_df['x'], test_df['y'])
I believe if you break this out into separate steps and avoid iterating through the entire dataframe, the speed will increase dramatically. Give the following a shot.
sea_level_pressure = 1013.25
test_df['result_1'] = (test_df['x']/sea_level_pressure)
test_df['result_1'] = test_df['result_1']**0.190284
test_df['result_1'] = (1 - test_df['result_1'])*145366.45
test_df['result_2'] = 3.28084*((np.log(test_df['x']/sea_level_pressure)*test_df['y'])/(-1*(9.80665*0.0289644/8.31447)))
test_df['final_result'] = np.where(pd.isnull(test_df['y']), test_df['result_1'], test_df['result_2'])
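To check that a vectorized version gives the same numbers as the row-by-row approach, the two can be compared on a small frame before running on all 6 million rows. This is a sketch under assumptions: the sample values are made up, a missing temperature is represented as NaN, and the helper names elevation_scalar and elevation_vectorized are illustrative.

```python
import math
import numpy as np
import pandas as pd

P0 = 1013.25                              # sea level pressure from the question
G, M, R0 = 9.80665, 0.0289644, 8.31447
B, C = 0.190284, 145366.45

def elevation_scalar(P, T):
    """Row-by-row version: one pressure, one temperature (NaN if missing)."""
    if not math.isnan(T):
        return 3.28084 * ((math.log(P / P0) * T) / -(G * M / R0))
    return (1 - math.pow(P / P0, B)) * C

def elevation_vectorized(P, T):
    """Column-wise version: both branches are computed, np.where picks one."""
    with_temp = 3.28084 * ((np.log(P / P0) * T) / -(G * M / R0))
    without_temp = (1 - np.power(P / P0, B)) * C
    return np.where(T.notnull(), with_temp, without_temp)

# Made-up sample data; y is NaN where no temperature is available
test_df = pd.DataFrame({"x": [1013.25, 900.0, 850.0],
                        "y": [288.15, np.nan, 280.0]})

test_df["row_wise"] = test_df.apply(
    lambda row: elevation_scalar(row["x"], row["y"]), axis=1)
test_df["vectorized"] = elevation_vectorized(test_df["x"], test_df["y"])
print(test_df)
```

The vectorized call operates on whole columns at once, which is where the speedup over apply(..., axis=1) comes from.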
I'm new to Python programming. Recently I've been trying to write a "tool" program for a "dynamic programming" algorithm. However, the last part of my program, a while loop, fails to loop. The code looks like this:
import numpy as np
beta, rho, B, M = 0.5, 0.9, 10, 5
S = range(B + M + 1) # State space = 0,...,B + M
Z = range(B + 1) # Shock space = 0,...,B
def U(c):
    "Utility function."
    return c**beta
def phi(z):
    "Probability mass function, uniform distribution."
    return 1.0 / len(Z) if 0 <= z <= B else 0
def Gamma(x):
    "The correspondence of feasible actions."
    return range(min(x, M) + 1)
def T(v):
    """An implementation of the Bellman operator.
    Parameters: v is a sequence representing a function on S.
    Returns: Tv, a list."""
    Tv = []
    for x in S:
        # Compute the value of the objective function for each
        # a in Gamma(x), and store the result in vals (n*m matrix)
        vals = []
        for a in Gamma(x):
            y = U(x - a) + rho * sum(v[a + z]*phi(z) for z in Z)
            # the place v comes into play, v is array for each state
            vals.append(y)
        # Store the maximum reward for this x in the list Tv
        Tv.append(max(vals))
    return Tv
# create initial value
def v_init():
    v = []
    for i in S:
        val = []
        for j in Gamma(i):
            # deterministic
            y = U(i-j)
            val.append(y)
        v.append(max(val))
    return v
# Create an instance of value function
v = v_init()
# parameters
max_iter = 10000
tol = 0.0001
num_iter = 0
diff = 1.0
N = len(S)
# value iteration
value = np.empty([max_iter,N])
while (diff>=tol and num_iter<max_iter ):
    v = T(v)
    value[num_iter] = v
    diff = np.abs(value[-1] - value[-2]).max()
    num_iter = num_iter + 1
As you can see, the while loop at the bottom is used to iterate over the "value function" and find the right answer. However, the while loop fails to loop and just returns num_iter=1. As far as I know, a while loop "repeats a sequence of statements until some condition becomes false"; clearly, this condition should not become false until diff converges to near 0.
The main part of the code works just fine as long as I use the following for loop:
value = np.empty([num_iter,N])
for x in range(num_iter):
    v = T(v)
    value[x] = v
    diff = np.abs(value[-1] - value[-2]).max()
    print(diff)
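You define value as np.empty(...), which allocates the array without initializing it; its contents are arbitrary and in practice are often zeros. value[-1] and value[-2] are the last two rows of the whole max_iter-row array, not the two rows written most recently, so on the first pass they are still untouched. If they happen to hold zeros, the difference between them is zero. 0 is not >= 0.0001, so the condition is False and the loop exits after one iteration.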
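One way to avoid the problem is to compare each new iterate with the previous one directly instead of indexing from the end of a preallocated array. The sketch below assumes a generic operator passed in as a function; toy_T is a made-up contraction used only to make the example self-contained, and iterate_to_convergence is an illustrative name. The Bellman operator T from the question could be passed in its place.

```python
import numpy as np

def iterate_to_convergence(T, v0, tol=1e-4, max_iter=10_000):
    """Apply T repeatedly until successive iterates differ by less than tol."""
    v = np.asarray(v0, dtype=float)
    history = [v]                    # grows as needed, no uninitialized rows
    diff = np.inf
    num_iter = 0
    while diff >= tol and num_iter < max_iter:
        v_new = np.asarray(T(v), dtype=float)
        diff = np.abs(v_new - v).max()   # compare with the previous iterate
        history.append(v_new)
        v = v_new
        num_iter += 1
    return np.array(history), num_iter, diff

def toy_T(v):
    # Hypothetical contraction with fixed point 2.0, stands in for the real T
    return 0.5 * v + 1.0

values, num_iter, diff = iterate_to_convergence(toy_T, [0.0, 10.0])
print(num_iter, diff, values[-1])
```

If a preallocated array is preferred, the equivalent fix is to compare value[num_iter] with value[num_iter - 1] once at least two rows have been written.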