This question is probably very simple, but for the life of me I can't figure it out. I'm modeling the voltage of a single neuron that receives random input spikes from other neurons. A friend helped me write a function in which some excitatory neurons provide random Poisson spikes that push the voltage up and some inhibitory neurons provide spikes that pull it down. I've included the code below. The step I can't figure out is how to make the I_syn term work in the iterative update. I would normally just write I_syn[i-1], but that gives me an error:
'function' object has no attribute '__getitem__'.
So I'm sure this question is really simple, but it's a problem I don't know how to overcome. How do I get this program to iterate the I_syn term properly so I can do a basic iterative scheme of an ODE while including a function defined previously in the code? It's important because I'll likely have more complicated neuron equations in the near future, so it would be much better to write the functions beforehand and then call them into the iteration step as needed. Thank you!
from numpy import *
from pylab import *
import numpy as np

## setup parameters and state variables
T = 50                      # total time to simulate (msec)
dt = 0.125                  # simulation time step (msec)
time = arange(0, T+dt, dt)  # time array
t_rest = 0                  # initial refractory time

## LIF properties
Vm = zeros(len(time))       # potential (V) trace over time
Rm = 1                      # resistance (kOhm)
Cm = 10                     # capacitance (uF)
tau_m = Rm*Cm               # time constant (msec)
tau_ref = 4                 # refractory period (msec)
Vth = 1                     # spike threshold (V)
V_spike = 0.5               # spike delta (V)

## Stimulus
I = 1.5                     # input current (A)

N = 1000
N_ex = 0.8*N                #(0..79)
N_in = 0.2*N                #(80..99)
G_ex = 0.1
K = 4

def I_syn(spks, t):
    """
    Synaptic current
    spks = [[synid, t],]
    """
    if len(spks) == 0:
        return 0
    exspk = spks[spks[:,0] < N_ex]   # Check for all excitatory spikes
    delta_k = exspk[:,1] == t        # Delta function
    if np.any(delta_k) > 0:
        h_k = np.random.rand(len(delta_k)) < 0.90  # probability of successful transmission
    else:
        h_k = 0
    inspk = spks[spks[:,0] >= N_ex]  # Check remaining neurons for inhibitory spikes
    delta_m = inspk[:,1] == t        # Delta function for inhibitory neurons
    if np.any(delta_m) > 0:
        h_m = np.random.rand(len(delta_m)) < 0.90
    else:
        h_m = 0
    isyn = Cm*G_ex*(np.sum(h_k*delta_k) - K*np.sum(h_m*delta_m))
    return isyn

## iterate over each time step
for i, t in enumerate(time):
    if t > t_rest:
        Vm[i] = Vm[i-1] + (-Vm[i-1] + I_syn*Rm) / tau_m * dt
        if Vm[i] >= Vth:
            Vm[i] += V_spike
            t_rest = t + tau_ref

## plot membrane potential trace
plot(time, Vm)
title('Leaky Integrate-and-Fire Example')
ylabel('Membrane Potential (V)')
xlabel('Time (msec)')
ylim([0,2])
show()
I_syn is just a function so using I_syn[i-1] will throw this error:
'function' object has no attribute '__getitem__'
If what you are looking for is a return value from the function, then you should first call it and then access what you want.
# pass related arguments as well since the function expects it
I_syn(arg1, arg2)[i-1]
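In this particular loop that means calling the function each time step rather than indexing it. A minimal sketch, assuming a spks array of [synid, spike_time] rows has been generated beforehand (that part is not shown in the posted code); note that I_syn returns a scalar for the given t, so no further indexing is needed:
# Sketch: `spks` is assumed to be a pre-generated numpy array of [synid, spike_time] rows.
for i, t in enumerate(time):
    if t > t_rest:
        Vm[i] = Vm[i-1] + (-Vm[i-1] + I_syn(spks, t) * Rm) / tau_m * dt
        if Vm[i] >= Vth:
            Vm[i] += V_spike
            t_rest = t + tau_ref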
I am working on designing an experiment, where a system can receive 5 inputs [x1, x2, x3, x4, x5] and record target data associated with this "state". Given a new input state, the system will then transition from the old to the new state with a certain speed (these are constants) and record the target value along the path. Given the constraint in total time the system can run, what is the best subset of states that can be selected and the optimal trajectory to maximize the total diversity of data points generated?
The following are the variables we use:
x1 = np.arange(5, 37, 2)
x2 = np.arange(5, 53, 3)
x3 = np.arange(0, 3, 1)
x4 = np.arange(0, 4.3, 0.3)
x5 = np.arange(-3, 3.2, 0.2)
There are a total of 357120 possible state combinations
Total experiment time cannot exceed 20 minutes (or 1200 seconds)
When a new state is reached, the system remains stationary for 5 seconds before beginning to move to a new state
The following are the rates of change (per 1/10 of a second) for each variable:
x1_speed = 0.1
x2_speed = 0.1
x3_speed = 0.005
x4_speed = 0.13
x5_speed = 0.1
The best strategy I could come up with so far is to randomly initialize a subset of states within the constraints of the total time allowed, calculate the total Euclidean distance between all points in the data, and record the subset of states that maximizes that distance as a measure of the diversity of the data points. However, inspecting some interactions in the 2D plane still leaves chunks of the space unmapped (see the picture below).
[Figure: result of random state generation and optimization for Euclidean distance]
I've looked into path-optimization problems, but given the number of possible states and transitions, simulating all possible edges between all possible nodes seemed like the wrong approach. The fact that the points selected for the experiment have to be optimized, rather than being pre-defined, complicates things as well. If this is a diversity-maximization LP problem with some constraints, how do you make it recursive to map out the whole path? Alternatively, I have been reading more about a Reinforcement Learning approach.
The code I used to accomplish the above is as follows. I skipped the part where I record the states, but there are extra lines to populate states_df, which I then use to calculate the total Euclidean distance. I can add it if necessary, but please let me know if there is a better strategy/algorithm to solve this problem given some of the constraints, thanks!
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist

def state_gen():
    var1 = np.random.choice(x1)
    var2 = np.random.choice(x2)
    var3 = round(np.random.choice(x3), 1)
    var4 = round(np.random.choice(x4), 1)
    var5 = round(np.random.choice(x5), 1)
    state = [var1, var2, var3, var4, var5]
    return state

def state_change(state_old):
    state_new = state_gen()
    #x1
    x1_travel = state_old[0] - state_new[0]
    x1_time = abs(x1_travel/x1_speed)  # in tenths of a second
    #x2
    x2_travel = state_old[1] - state_new[1]
    x2_time = abs(x2_travel/x2_speed)  # in tenths of a second
    #x3
    x3_travel = state_old[2] - state_new[2]
    x3_time = abs(x3_travel/x3_speed)  # in tenths of a second
    #x4
    x4_travel = state_old[3] - state_new[3]
    x4_time = abs(x4_travel/x4_speed)  # in tenths of a second
    #x5
    x5_travel = state_old[4] - state_new[4]
    x5_time = abs(x5_travel/x5_speed)  # in tenths of a second
    #Record values to **states_df** here, add to return statement
    t_change = max(x1_time, x2_time, x3_time, x4_time, x5_time)
    return state_new, t_change

def objective():
    #Initialize first state
    state_old = state_gen()
    t_tot = 0
    while t_tot < 11500:  # little under 20 minutes in tenths of seconds
        state_new, t_change = state_change(state_old)
        #System stationary for 5 seconds
        t_change += 50
        t_tot += t_change
        state_old = state_new  # continue the path from the newly reached state
    tot_euc = pdist(states_df.values, metric='euclidean').sum()
    return tot_euc, t_tot, states_df

iter_nums = 250000
tot_euc_max = 0
for i in range(0, iter_nums):
    tot_euc, tot_time, states_df = objective()
    if tot_euc > tot_euc_max:
        tot_euc_max = tot_euc
        print('Maximum {} attained'.format(tot_euc_max))
        print('Total run time: {}'.format(tot_time))
I want to optimize the operation of a grid-coupled PV battery system with Pyomo. I assume a given timeseries of PV power and electricity load in kWh and static grid selling and buying prices.
But I do not want to implement a perfect-foresight approach - one optimization problem where the optimizer knows all timesteps. Instead, the optimizer should only see some timesteps into the future. In the following MWE the solver should see the next 6 timesteps for the optimization of the overall 12 timesteps.
Question 1: I wonder if this is the most efficient implementation (in terms of computing time)?
#%%
import pyomo.environ as pyo
import pandas as pd

#%% Define data
data_import = pd.DataFrame(data={'pv': [0,0,0,0,0,0,0,0,10,20,40,100],
                                 'load': [20,20,20,20,20,20,20,30,30,30,30,30]},
                           index=[1,2,3,4,5,6,7,8,9,10,11,12])

#%% Define chunked data
chunked_data = list()
chunk_size = 6
# Add repeated index starting at 1 until chunk_size
data_import.insert(loc=0, column='index', value=list(range(1,chunk_size+1))+list(range(1,chunk_size+1)))
# Chunk data into chunked_data of length 6
for i in range(0, len(data_import), chunk_size):
    chunked_data.append(data_import[i:i+chunk_size])

# results list of optimization results
results_battery_soc = []
results_grid_power_import = []
results_grid_power_export = []

#%% Define model for each chunked_data
for i in range(0, len(chunked_data)):
    print(i)
    # Get chunked data list
    data = chunked_data[i]
    if i == 0:
        bat_soc_initial = 0.5
    else:
        bat_soc_initial = results_battery_soc[-1]

    # Define model
    model = pyo.ConcreteModel()
    # Define timeperiods
    model.T = pyo.RangeSet(len(data))

    # Define general parameters
    model.grid_cost_buy = 0.30
    model.grid_cost_sell = 0.10
    # Define model parameters from chunked input data
    model.pv_power = pyo.Param(model.T, initialize=data.set_index('index')['pv'].to_dict())
    model.load_power = pyo.Param(model.T, initialize=data.set_index('index')['load'].to_dict())

    # Create a block for a single time period
    def block_rule(b, t):
        # define the parameters
        b.battery_soc_initial_value = pyo.Param(initialize=bat_soc_initial)
        b.battery_eff = pyo.Param(initialize=0.9)
        b.battery_eoCH = pyo.Param(initialize=1.0)
        b.battery_eoDCH = pyo.Param(initialize=0.1)
        b.battery_capacity = pyo.Param(initialize=380)
        # define the variables
        b.battery_power_CH = pyo.Var(domain=pyo.NonNegativeReals)
        b.battery_power_DCH = pyo.Var(domain=pyo.NonNegativeReals)
        b.battery_soc = pyo.Var(bounds=(b.battery_eoDCH, b.battery_eoCH))
        b.battery_soc_initial = pyo.Var()
        b.grid_power_import = pyo.Var(domain=pyo.NonNegativeReals)
        b.grid_power_export = pyo.Var(domain=pyo.NonNegativeReals)

        ## Define the constraints
        # Balanced electricity bus rule
        def balanced_bus_rule(_b):
            return (0 == (_b.model().pv_power[t] - _b.model().load_power[t]
                          + _b.battery_power_DCH - _b.battery_power_CH
                          + _b.grid_power_import - _b.grid_power_export))
        b.bus_c = pyo.Constraint(rule=balanced_bus_rule)

        # Battery end of CH constraint
        def battery_end_of_CH_rule(_b):
            return (_b.battery_eoCH >= _b.battery_soc)
        b.battery_eoCH_c = pyo.Constraint(rule=battery_end_of_CH_rule)

        # Battery end of DCH constraint
        def battery_end_of_DCH_rule(_b):
            return (_b.battery_eoDCH <= _b.battery_soc)
        b.battery_eoDCH_c = pyo.Constraint(rule=battery_end_of_DCH_rule)

        # Battery SoC constraint
        def battery_soc_rule(_b):
            return (_b.battery_soc == _b.battery_soc_initial +
                    ((_b.battery_power_CH - _b.battery_power_DCH) / _b.battery_capacity))
        b.battery_soc_c = pyo.Constraint(rule=battery_soc_rule)

    # Initialize Blocks for each timestep defined in T
    model.pvbatb = pyo.Block(model.T, rule=block_rule)
    #model.pprint()

    #%% Further constraints
    # link the battery SoC variables between different timesteps
    def battery_soc_linking_rule(m, t):
        if t == m.T.first():
            return m.pvbatb[t].battery_soc_initial == m.pvbatb[t].battery_soc_initial_value
        return m.pvbatb[t].battery_soc_initial == m.pvbatb[t-1].battery_soc
    model.battery_soc_linking = pyo.Constraint(model.T, rule=battery_soc_linking_rule)

    #%% Objective function
    # define the cost function
    def obj_rule(m):
        return sum(m.pvbatb[t].grid_power_import * m.grid_cost_buy
                   - m.pvbatb[t].grid_power_export * m.grid_cost_sell for t in m.T)
    model.obj = pyo.Objective(rule=obj_rule, sense=pyo.minimize)

    #%% Solve the problem
    solver = pyo.SolverFactory('glpk')
    results = solver.solve(model)

    #%% Access of results
    for t in model.T:
        results_battery_soc.append(pyo.value(model.pvbatb[t].battery_soc))
        results_grid_power_import.append(pyo.value(model.pvbatb[t].grid_power_import))
        results_grid_power_export.append(pyo.value(model.pvbatb[t].grid_power_export))
Question 2: This example divides the 12-timestep timeseries into two timeseries of 6 timesteps each and optimizes them individually - 2 optimization problems. The number of solved problems increases further if one optimizes the timesteps 1-6, then 2-7, then 3-8, then 4-9, then 5-10, then 6-11 and finally 7-12, which results in 7 optimization problems. In this case the question of an efficient implementation becomes even more important.
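For illustration only (a sketch, not code from the MWE above), the overlapping windows described in question 2 could be built from the same data_import like this, after which one model per window would be solved and only its first timestep kept:
# Sketch: overlapping 6-step windows 1-6, 2-7, ..., 7-12 (receding horizon)
# instead of the two disjoint chunks. The repeated 'index' column used by the
# chunked version would have to be rebuilt per window.
horizon = 6
windows = [data_import.iloc[start:start + horizon]
           for start in range(len(data_import) - horizon + 1)]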
Thank you so much for your support and proposals. I am totally stuck at this point!
GEKKO is optimization software for mixed-integer and differential algebraic equations. It is coupled with large-scale solvers for linear, quadratic, nonlinear, and mixed integer programming (LP, QP, NLP, MILP, MINLP).
I use GEKKO to control my TCLab Arduino, but when I introduce a disturbance, no matter how I adjust the parameters, there is always a temperature overshoot. How can I solve this problem?
Here is my code:
import tclab
import numpy as np
import time
import matplotlib.pyplot as plt
from gekko import GEKKO

# Connect to Arduino
a = tclab.TCLab()

# Get Version
print(a.version)

# Turn LED on
print('LED On')
a.LED(100)

# Run time in minutes
run_time = 60.0

# Number of cycles
loops = int(60.0*run_time)
tm = np.zeros(loops)

# Temperature (degC)
T1 = np.ones(loops) * a.T1   # temperature (degC)
Tsp1 = np.ones(loops) * 35.0 # set point (degC)

# heater values
Q1s = np.ones(loops) * 0.0

#########################################################
# Initialize Model
#########################################################
# use remote=True for MacOS
m = GEKKO(name='tclab-mpc', remote=False)

# 100 second time horizon
m.time = np.linspace(0, 100, 101)

# Parameters
Q1_ss = m.Param(value=0)
TC1_ss = m.Param(value=a.T1)
Kp = m.Param(value=0.8)
tau = m.Param(value=160.0)

# Manipulated variable
Q1 = m.MV(value=0)
Q1.STATUS = 1    # use to control temperature
Q1.FSTATUS = 0   # no feedback measurement
Q1.LOWER = 0.0
Q1.UPPER = 100.0
Q1.DMAX = 50.0
# Q1.COST = 0.0
Q1.DCOST = 0.2

# Controlled variable
TC1 = m.CV(value=TC1_ss.value)
TC1.STATUS = 1   # minimize error with setpoint range
TC1.FSTATUS = 1  # receive measurement
TC1.TR_INIT = 2  # reference trajectory
TC1.TR_OPEN = 2  # reference trajectory
TC1.TAU = 35     # time constant for response

m.Equation(tau * TC1.dt() + (TC1-TC1_ss) == Kp * (Q1-Q1_ss))

# Global Options
m.options.IMODE = 6    # MPC
m.options.CV_TYPE = 1  # Objective type
m.options.NODES = 2    # Collocation nodes
m.options.SOLVER = 1   # 1=APOPT, 3=IPOPT

##################################################################
# Create plot
plt.figure()
plt.ion()
plt.show()

filter_tc1 = []

def movefilter(predata, new, n):
    if len(predata) < n:
        predata.append(new)
    else:
        predata.pop(0)
        predata.append(new)
    return np.average(predata)

# Main Loop
start_time = time.time()
prev_time = start_time
try:
    for i in range(1, loops):
        # Sleep time
        sleep_max = 1.0
        sleep = sleep_max - (time.time() - prev_time)
        if sleep >= 0.01:
            time.sleep(sleep)
        else:
            time.sleep(0.01)

        # Record time and change in time
        t = time.time()
        dt = t - prev_time
        prev_time = t
        tm[i] = t - start_time

        # Read temperature (degC)
        curr_T1 = a.T1
        last_T1 = curr_T1
        avg_T1 = movefilter(filter_tc1, last_T1, 3)
        T1[i] = curr_T1

        ###############################
        ### MPC CONTROLLER          ###
        ###############################
        TC1.MEAS = avg_T1
        # input setpoint with deadband +/- DT
        DT = 0.1
        TC1.SPHI = Tsp1[i] + DT
        TC1.SPLO = Tsp1[i] - DT
        # solve MPC
        m.solve(disp=False)
        # test for successful solution
        if (m.options.APPSTATUS == 1):
            # retrieve the first Q value
            Q1s[i] = Q1.NEWVAL
        else:
            # not successful, set heater to zero
            Q1s[i] = 0
        # Write output (0-100)
        a.Q1(Q1s[i])

        # Plot
        plt.clf()
        ax = plt.subplot(2, 1, 1)
        ax.grid()
        plt.plot(tm[0:i], T1[0:i], 'ro', markersize=3, label=r'$T_1$')
        plt.plot(tm[0:i], Tsp1[0:i], 'b-', markersize=3, label=r'$T_1 Setpoint$')
        plt.ylabel('Temperature (degC)')
        plt.legend(loc='best')
        ax = plt.subplot(2, 1, 2)
        ax.grid()
        plt.plot(tm[0:i], Q1s[0:i], 'r-', linewidth=3, label=r'$Q_1$')
        plt.ylabel('Heaters')
        plt.xlabel('Time (sec)')
        plt.legend(loc='best')
        plt.draw()
        plt.pause(0.05)

    # Turn off heaters
    a.Q1(0)
    a.Q2(0)
    print('Shutting down')
    a.close()

# Allow user to end loop with Ctrl-C
except KeyboardInterrupt:
    # Disconnect from Arduino
    a.Q1(0)
    a.Q2(0)
    print('Shutting down')
    a.close()

# Make sure serial connection still closes when there's an error
except:
    # Disconnect from Arduino
    a.Q1(0)
    a.Q2(0)
    print('Error: Shutting down')
    a.close()
    raise
Here is the test result picture.
When you add the disturbance (such as turning on the other heater), the apparent system gain increases because the temperature rises higher than anticipated by the controller. That means you move to the left on the mismatch plot, which leads to worse control performance.
This is Figure 14 in Hedengren, J. D., Eaton, A. N., Overview of Estimation Methods for Industrial Dynamic Systems, Optimization and Engineering, Springer, Vol 18 (1), 2017, pp. 155-178, DOI: 10.1007/s11081-015-9295-9.
One of the reasons for the overshoot is model mismatch. Here are a few ways to deal with it (a small sketch of the suggested settings follows this list):
Increase your model gain K (maybe to 1) or decrease your model tau (maybe to 120) so that the controller becomes less aggressive. You may also want to re-identify your model so that it better reflects your TCLab system dynamics. Here is a tutorial on getting a first order or second order model. A higher order ARX model also works well for the TCLab.
Change the reference trajectory to be less aggressive with TC.TAU=50 and include the reference trajectory on the plot so that you can observe what the controller is planning. I also like to include the unbiased model on the plot to show how the model is performing.
Check out this Control Tuning page for help with other MV and CV tuning options. The Jupyter notebook widget can help give you an intuitive understanding of those options.
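For concreteness, a minimal sketch of how those tuning suggestions map onto the model setup in the question (the numbers are only the illustrative values mentioned above, not re-identified parameters):
# Less aggressive controller: larger model gain, smaller model time constant,
# and a slower reference trajectory (illustrative values only).
Kp = m.Param(value=1.0)     # was 0.8
tau = m.Param(value=120.0)  # was 160.0
TC1.TAU = 50                # was 35; slower setpoint reference trajectory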
Edited to include VBA code for comparison
Also, we know the analytical value, which is 8.021, towards which the Monte-Carlo should converge, which makes the comparison easier.
Excel VBA gives 8.067 based on averaging 5 Monte-Carlo simulations (7.989, 8.187, 8.045, 8.034, 8.075)
Python gives 7.973 based on 5 MCs (7.913, 7.915, 8.203, 7.739, 8.095) and a larger Variance!
The VBA code is not even "that good", using a rather bad way to produce samples from Standard Normal!
I am running a super simple code in Python to price European Call Option via Monte Carlo, and I am surprised at how "bad" the convergence is with 10,000 "simulated paths". Usually, when running a Monte-Carlo for this simple problem in C++ or even VBA, I get better convergence.
I show the code below (the code is taken from the textbook "Python for Finance" and I run it in Visual Studio Code under Python 3.7.7, 64-bit version). I get the following results, as an example: Run 1 = 7.913, Run 2 = 7.915, Run 3 = 8.203, Run 4 = 7.739, Run 5 = 8.095.
Results that differ by so much would be unacceptable. How can the convergence be improved? (Obviously by running more paths, but as I said: for 10,000 paths, the result should already have converged much better.)
#MonteCarlo valuation of European Call Option
import math
import numpy as np
#Parameter Values
S_0 = 100. # initial value
K = 105. # strike
T = 1.0 # time to maturity
r = 0.05 # short rate (constant)
sigma = 0.2 # vol
nr_simulations = 10000
#Valuation Algo:
# Notice the vectorization below, instead of a loop
z = np.random.standard_normal(nr_simulations)
# Notice that the S_T below is a VECTOR!
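# Note: the drift term below omits the factor T, i.e. (r - 0.5*sigma**2)*T; it only matches GBM here because T = 1.0.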
S_T = S_0 * np.exp((r-0.5*sigma**2)+math.sqrt(T)*sigma*z)
#Call option pay-off at maturity (Vector!)
C_T = np.maximum((S_T-K),0)
# C_0 is a scalar
C_0 = math.exp(-r*T)*np.average(C_T)
print('Value of the European Call is: ', C_0)
I also include VBA code, which produces slightly better results (in my opinion): with the VBA code below, I get 7.989, 8.187, 8.045, 8.034, 8.075.
Option Explicit

Sub monteCarlo()
    ' variable declaration
    ' stock initial & final values, option pay-off at maturity
    Dim stockInitial, stockFinal, optionFinal As Double
    ' r = rate, sigma = volatility, strike = strike price
    Dim r, sigma, strike As Double
    ' maturity of the option
    Dim maturity As Double

    ' instantiate variables
    stockInitial = 100#
    r = 0.05
    maturity = 1#
    sigma = 0.2
    strike = 105#

    ' normal is Standard Normal
    Dim normal As Double
    ' randomNr is randomly generated nr via "rnd()" function, between 0 & 1
    Dim randomNr As Double
    ' variable for storing the final result value
    Dim result As Double

    Dim i, j As Long, monteCarlo As Long
    monteCarlo = 10000

    For j = 1 To 5
        result = 0#
        For i = 1 To monteCarlo
            ' get random nr between 0 and 1
            randomNr = Rnd()
            'max(Rnd(), 0.000000001)
            ' standard Normal
            normal = Application.WorksheetFunction.Norm_S_Inv(randomNr)
            stockFinal = stockInitial * Exp((r - (0.5 * (sigma ^ 2))) + (sigma * Sqr(maturity) * normal))
            optionFinal = max((stockFinal - strike), 0)
            result = result + optionFinal
        Next i
        result = result / monteCarlo
        result = result * Exp(-r * maturity)
        Worksheets("sheet1").Cells(j, 1) = result
    Next j

    MsgBox "Done"
End Sub

Function max(ByVal number1 As Double, ByVal number2 As Double)
    If number1 > number2 Then
        max = number1
    Else
        max = number2
    End If
End Function
I don't think there is anything wrong with Python or numpy internals; the convergence should definitely be the same no matter what tool you're using. I ran a few simulations with different sample sizes and different sigma values. No surprise, it turns out the speed of convergence is heavily controlled by the sigma value, see the plot below. Note that the x axis is on a log scale! After the bigger oscillations fade away there are more smaller waves before it stabilizes. This is easiest to see at sigma=0.5.
I'm definitely not an expert, but I think the most obvious solution is to increase sample size, as you mentioned. It would be nice to see results and code from C++ or VBA, because I don't know how familiar you are with numpy and python functions. Maybe something is not doing what you think it's doing.
Code to generate the plot (let's not talk about efficiency, it's horrible):
import numpy as np
import matplotlib.pyplot as plt

S_0 = 100.  # initial value
K = 105.    # strike
T = 1.0     # time to maturity
r = 0.05    # short rate (constant)

fig = plt.figure()
ax = fig.add_subplot()
plt.xscale('log')

samplesize = np.geomspace(1000, 20000000, 64)
sigmas = np.arange(0, 0.7, 0.1)

for s in sigmas:
    arr = []
    for n in samplesize:
        n = n.astype(int)
        z = np.random.standard_normal(n)
        S_T = S_0 * np.exp((r-0.5*s**2)+np.sqrt(T)*s*z)
        C_T = np.maximum((S_T-K),0)
        C_0 = np.exp(-r*T)*np.average(C_T)
        arr.append(C_0)
    ax.scatter(samplesize, arr, label=f'sigma={s:.2f}')

plt.tight_layout()
plt.xlabel('Sample size')
plt.ylabel('Value')
plt.grid()
handles, labels = ax.get_legend_handles_labels()
plt.legend(handles[::-1], labels[::-1], loc='upper left')
plt.show()
Addition:
This time you got results closer to the real value using VBA. But sometimes you don't; the effect of randomness is too big here. The truth is that averaging only 5 results from a low-sample-size simulation is not meaningful. For example, averaging 50 different simulations in Python (with only n=10000, even though you shouldn't do that if you want the right answer) yields 8.025167 (± 0.039717 at the 95% confidence level), which is very close to the real solution.
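For reference, a small sketch (reusing the question's parameter values) of how that ± half-width at the 95% confidence level can be estimated from a single run, via the sample standard deviation of the discounted payoffs:
import math
import numpy as np

# Sketch: one Monte-Carlo run plus a 95% confidence interval for its estimate.
S_0, K, T, r, sigma, n = 100., 105., 1.0, 0.05, 0.2, 10000
z = np.random.standard_normal(n)
S_T = S_0 * np.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
disc_payoff = math.exp(-r * T) * np.maximum(S_T - K, 0)     # discounted payoffs
C_0 = disc_payoff.mean()
half_width = 1.96 * disc_payoff.std(ddof=1) / math.sqrt(n)  # 95% CI half-width
print('C_0 = {:.3f} +/- {:.3f}'.format(C_0, half_width))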
How can I generate random-walk data between given start and end values, while never going over a maximum value or under a minimum value?
Here is my attempt to do this, but for some reason the series sometimes goes over the max or under the min value. The start and end values seem to be respected, but not the minimum and maximum. How can this be fixed? Also, I would like to specify the standard deviation of the fluctuations but don't know how; I currently use a randomPerc parameter for the fluctuation, which is wrong, since I would like to give the std instead.
import numpy as np
import matplotlib.pyplot as plt

def generateRandomData(length, randomPerc, min, max, start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max - min) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1], len(data_np))
    return data_np

randomData = generateRandomData(length=1000, randomPerc=0.5, min=50, max=100, start=66, end=80)

## print values
print("Max Value", randomData.max())
print("Min Value", randomData.min())
print("Start Value", randomData[0])
print("End Value", randomData[-1])
print("Standard deviation", np.std(randomData))

## plot values
plt.figure()
plt.plot(range(randomData.shape[0]), randomData)
plt.show()
plt.close()
Here is a simple loop which checks for series that go under the minimum or over the maximum value. This is exactly what I am trying to avoid; the series should stay between the given min and max limits.
## generate 1000 series and check if there are any values over the maximum limit or under the minimum limit
for i in range(1000):
    randomData = generateRandomData(length=1000, randomPerc=0.5, min=50, max=100, start=66, end=80)
    if randomData.min() < 50:
        print(i, "Value Lower than Min limit")
    if randomData.max() > 100:
        print(i, "Value Higher than Max limit")
As you impose conditions on your walk, it cannot be considered purely random. Anyway, one way is to generate the walk iteratively and check the boundaries on each iteration. But if you want a vectorized solution, here it is:
import numpy as np

def bounded_random_walk(length, lower_bound, upper_bound, start, end, std):
    assert (lower_bound <= start and lower_bound <= end)
    assert (start <= upper_bound and end <= upper_bound)

    bounds = upper_bound - lower_bound

    rand = (std * (np.random.random(length) - 0.5)).cumsum()
    rand_trend = np.linspace(rand[0], rand[-1], length)
    rand_deltas = (rand - rand_trend)
    rand_deltas /= np.max([1, (rand_deltas.max() - rand_deltas.min()) / bounds])

    trend_line = np.linspace(start, end, length)
    upper_bound_delta = upper_bound - trend_line
    lower_bound_delta = lower_bound - trend_line

    upper_slips_mask = (rand_deltas - upper_bound_delta) >= 0
    upper_deltas = rand_deltas - upper_bound_delta
    rand_deltas[upper_slips_mask] = (upper_bound_delta - upper_deltas)[upper_slips_mask]

    lower_slips_mask = (lower_bound_delta - rand_deltas) >= 0
    lower_deltas = lower_bound_delta - rand_deltas
    rand_deltas[lower_slips_mask] = (lower_bound_delta + lower_deltas)[lower_slips_mask]

    return trend_line + rand_deltas
randomData = bounded_random_walk(1000, lower_bound=50, upper_bound =100, start=50, end=100, std=10)
You can see it as the solution of a geometric problem. The trend_line connects your start and end points and has margins defined by lower_bound and upper_bound. rand is your random walk, rand_trend is its trend line, and rand_deltas is its deviation from the rand trend line. We collocate the trend lines and want to make sure the deltas don't exceed the margins: when rand_deltas exceeds the allowed margin, we "fold" the excess back towards the bounds.
At the end you add the resulting random deltas to the start => end trend line, thus obtaining the desired bounded random walk.
The std parameter controls the amount of variation of the random walk.
Update: fixed the assertions. In this version "std" is not guaranteed to be the "interval".
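As a quick sanity check (a sketch only, reusing the question's own test loop on the function above):
for i in range(1000):
    walk = bounded_random_walk(1000, lower_bound=50, upper_bound=100,
                               start=66, end=80, std=10)
    if walk.min() < 50:
        print(i, "Value lower than min limit")
    if walk.max() > 100:
        print(i, "Value higher than max limit")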
I noticed you used built-in functions as argument names (min and max), which is not recommended (I changed these to max_1 and min_1). Other than this, your code should work as expected:
def generateRandomData(length, randomPerc, min_1, max_1, start, end):
    data_np = (np.random.random(length) - randomPerc).cumsum()
    data_np *= (max_1 - min_1) / (data_np.max() - data_np.min())
    data_np += np.linspace(start - data_np[0], end - data_np[-1], len(data_np))
    return data_np

randomData=generateRandomData(1000, 0.5, 50, 100, 66, 80)
If you are willing to modify your code this will work:
import random

for_fill = []
# generate 1000 samples within the specified range and save them in for_fill
for x in range(1000):
    generate_rnd_df = random.uniform(50, 100)
    for_fill.append(generate_rnd_df)

# set starting and end point manually
for_fill[0] = 60
for_fill[999] = 80
Here is one way, very crudely expressed in code.
>>> import random
>>> steps = 1000
>>> start = 66
>>> end = 80
>>> step_size = (50,100)
Generate 1,000 steps assured to be within the required range.
>>> crude_walk_steps = [random.uniform(*step_size) for _ in range(steps)]
>>> import numpy as np
Turn these steps into a walk but notice that they fail to meet the requirements.
>>> crude_walk = np.cumsum(crude_walk_steps)
>>> min(crude_walk)
57.099056617839288
>>> max(crude_walk)
75048.948693623403
Calculate a simple linear transformation to scale the steps.
>>> from sympy import *
>>> var('a b')
(a, b)
>>> solve([57.099056617839288*a+b-66,75048.948693623403*a+b-80])
{b: 65.9893403510312, a: 0.000186686954219243}
Scale the steps.
>>> walk = [0.000186686954219243*_+65.9893403510312 for _ in crude_walk]
Verify that the walk now starts and stops where intended.
>>> min(walk)
65.999999999999986
>>> max(walk)
79.999999999999986
You can also generate a stream of random walks and filter out those that do not meet your constraints. Just be aware that by filtering they are not really 'random' anymore.
The code below creates an infinite stream of 'valid' random walks. Be careful with very tight constraints; the 'next' call might take a while ;).
import itertools
import numpy as np

def make_random_walk(first, last, min_val, max_val, size):
    # Generate a sequence of random steps of length `size-2`
    # that will be taken between the start and stop values.
    steps = np.random.normal(size=size-2)
    # The walk is the cumsum of those steps
    walk = steps.cumsum()
    # Performing the walk from the start value gives you your series.
    series = walk + first
    # Compare the target min and max values with the observed ones.
    target_min_max = np.array([min_val, max_val])
    observed_min_max = np.array([series.min(), series.max()])
    # Calculate the absolute 'overshoot' for min and max values
    f = np.array([-1, 1])
    overshoot = (observed_min_max*f - target_min_max*f)
    # Calculate the scale factor to constrain the walk within the
    # target min/max values.
    # Don't upscale.
    correction_base = [walk.min(), walk.max()][np.argmax(overshoot)]
    scale = min(1, (correction_base - overshoot.max()) / correction_base)
    # Generate the scaled series
    new_steps = steps * scale
    new_walk = new_steps.cumsum()
    new_series = new_walk + first
    # Check the size of the final step necessary to reach the target endpoint.
    last_step_size = abs(last - new_series[-1])  # step needed to reach desired end
    # Is it larger than the largest previously observed step?
    if last_step_size > np.abs(new_steps).max():
        # If so, consider this series invalid.
        return None
    else:
        # Else, we found a valid series that meets the constraints.
        return np.concatenate((np.array([first]), new_series, np.array([last])))
start = 66
stop = 80
max_val = 100
min_val = 50
size = 1000

# Create an infinite stream of candidate series
candidate_walks = (
    (i, make_random_walk(first=start, last=stop, min_val=min_val, max_val=max_val, size=size))
    for i in itertools.count()
)
# Filter out the invalid ones.
valid_walks = ((i, w) for i, w in candidate_walks if w is not None)

idx, walk = next(valid_walks)  # Get the next valid series
print(
    "Walk #{}: min/max({:.2f}/{:.2f})"
    .format(idx, walk.min(), walk.max())
)