I am solving a DC Optimal Power Flow Problem and I am trying to brainstorm the most efficient way to iterate over a constraint in Pyomo.
The following is the data structure, where i and k are the nodes connected through a branch, and X is the reactance, a property of the branch.
Sample Branch Data
The constraint I am having trouble with is the following:
constraint
Here the symbols "delta" and "p" are variables in the constraint; each node has a single value of delta and of p. What this constraint basically does is ensure that the sum of the power flows into node i from all the connected nodes k equals the power value at that same node.
Here is an example of the constraint iterations for i=1 and i=2.
Sample constraint
So I am trying to find the most efficient way to state this constraint in pyomo, so that instead of writing out each constraint iteration like this:
def P1_rule(modelo):
    return modelo.p[0] - L[0] == ((modelo.d[0] - modelo.d[1]) / 0.1) + ((modelo.d[0] - modelo.d[2]) / 0.1)
model.P1 = Constraint(rule=P1_rule)

def P2_rule(modelo):
    return modelo.p[1] - L[1] == ((modelo.d[1] - modelo.d[0]) / 0.1) + ((modelo.d[1] - modelo.d[2]) / 0.1)
model.P2 = Constraint(rule=P2_rule)

def P3_rule(modelo):
    return modelo.p[2] - L[2] == ((modelo.d[2] - modelo.d[0]) / 0.1) + ((modelo.d[2] - modelo.d[1]) / 0.1)
model.P3 = Constraint(rule=P3_rule)
I want a single line like this, so that it easily generalizes over a huge network:
def P3_rule(modelo):
    return modelo.p[i] == ((modelo.d[i] - modelo.d[k]) / X[k])
model.P3 = Constraint(rule=P3_rule)
I have come up with a way that involves restructuring the data and creating new arrays of indices, etc. I would like to see if I can apply the constraint more directly, keeping the data in its current structure.
Ok so I figured out how to do it. The best way, which I didn't know was possible, is to put an if statement inside the summation, basically a full conditional iteration inside the sum. In the code below, G is the list of nodes, while "From" and "To" are the columns in the FullBranch data table giving the node numbers at each end of a branch.
def Try_rule(mod, g):
    return mod.p[g] - L[g] == sum((mod.d[i-1] - mod.d[k-1]) / FullBranch.loc[x, "X"]
                                  for x, (i, k) in enumerate(zip(FullBranch["From"], FullBranch["To"]))
                                  if i == g + 1)
model.Try = Constraint(G, rule=Try_rule)
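For reference, here is a minimal, self-contained setup of the pieces the rule above assumes (the names FullBranch, L, G, model.p and model.d are the ones from my code; the sample values below are only illustrative):

import pandas as pd
from pyomo.environ import ConcreteModel, Var, Constraint

FullBranch = pd.DataFrame({"From": [1, 1, 2],
                           "To":   [2, 3, 3],
                           "X":    [0.1, 0.1, 0.1]})
L = [0.0, 0.5, 0.5]   # illustrative nodal loads
G = range(3)          # node indices 0..2

model = ConcreteModel()
model.p = Var(G)      # nodal power injection
model.d = Var(G)      # nodal voltage angle (delta)
# ... Try_rule and model.Try exactly as above ...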
I've created a graph G with NetworkX, where my nodes are movies and actors, and there's an edge if an actor participated in a movie. I have a dictionary of all actors and one of all movies. I want to find the pair of movies that share the largest number of actors.
I thought that the solution could be something like this:
maximum = 0
pair = []
dict_coppia_movies = {}
for actor in actors:
    list_movies = list(nx.all_neighbors(G, actor))
    for movie1 in list_movies:
        for movie2 in list_movies:
            if movie1 != movie2:
                dict_coppia_movies[(movie1, movie2)] += 1
                if dict_coppia_movies[(movie1, movie2)] > maximum:
                    maximum = dict_coppia_movies[(movie1, movie2)]
                    pair = [movie1, movie2]
return pair
But this can't really work because there are 2 million actors.
I tried the code on a smaller case, but I ran into two problems:
The line dict_coppia_movies[(movie1,movie2)] += 1 doesn't work, but I could get the result I wanted with dict_coppia_movies[(movie1,movie2)] = dict_coppia_movies.get((movie1,movie2), 0) + 1.
I don't know how to specify that, if I have two films A and B, the combination "A,B" is the same as "B,A".
Instead, the algorithm creates two different keys.
I even tried something with nx.common_neighbors, which should give me the number of shared actors between two movies, but the problems were always the quadratic time and my inability to tell the algorithm to iterate only over distinct pairs of movies.
EDIT: Maybe I've found a solution, but I can't check whether it's the right one. I thought the wise road to follow would be nx.common_neighbors, so I could just iterate over two nodes at a time. To make the algorithm fast enough, I tried to use the zip function with the list of movies and the set of movies.
movieList = list(movies.keys())
movieSet = set(movieList)

def question3():
    maximum = 0
    pair = []
    for node1, node2 in zip(movies, movieSet):
        neighborsList = list(nx.common_neighbors(G, node1, node2))
        if len(neighborsList) > maximum:
            maximum = len(neighborsList)
            pair = [node1, node2]
    return pair
This algorithm gives me a result, but I can't really check whether it's correct. I know that when zip gets two lists or sets of different lengths it truncates to the shorter one, but here movies and movieSet have the same length, so it should work...
Based on my understanding of what each variable you're working with means, I think that the following should work. This makes use of Joel's "movie size" heuristic.
Notably, the sorting step is O(n log n), so it has no impact on the overall O(n²) complexity.
def question3():
    def comp_func(pair):
        return len(pair[1])
    # Keep only movie nodes, each paired with the set of its neighbours (actors),
    # sorted from the movie with the most actors to the one with the fewest.
    movie_list = sorted([(k, set(d)) for k, d in G.adjacency() if k in movies],
                        key=comp_func, reverse=True)
    maximum = 0
    pair = [None, None]
    for i, (movie_1, set_1) in enumerate(movie_list):
        # No later movie can beat the current maximum: its actor set is already too small.
        if len(set_1) <= maximum:
            break
        for movie_2, set_2 in movie_list[i+1:]:
            if len(set_2) <= maximum:
                break
            m = len(set_1 & set_2)   # number of shared actors
            if m > maximum:
                maximum = m
                pair[:] = movie_1, movie_2
    return pair
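As a side note, the two bookkeeping issues raised in the question (missing dictionary keys, and (A,B) vs (B,A) being counted as different pairs) can be sidestepped by counting unordered pairs per actor. A minimal sketch, where most_shared_pair is just an illustrative name:

from collections import Counter
from itertools import combinations
import networkx as nx

def most_shared_pair(G, actors):
    pair_counts = Counter()
    for actor in actors:
        # sorting gives each pair a canonical order, and combinations()
        # yields every unordered pair of movies exactly once
        for movie_a, movie_b in combinations(sorted(nx.all_neighbors(G, actor)), 2):
            pair_counts[(movie_a, movie_b)] += 1   # Counter treats missing keys as 0
    return max(pair_counts, key=pair_counts.get)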
I'm trying to bring this constraint into my pyomo model:
constraint
I define a set for indexing over time and I want to optimize the corresponding energy variable below
model.grid_time = Set(initialize=range(0, 23))
model.charging_energy = Var(model.grid_time, initialize=0)
My constraint definition looks like as follows:
model.limits = ConstraintList()
for t in model.grid_time:
    model.limits.add(sum(model.charging_energy[t] for t in model.grid_time) >= energy_demand.at[t, "total_energy_demand"])
The problem with these lines of code is that I'm summing over the whole indexing set model.grid_time and not just up to t. I think I need a second index variable (replacing the t in the inner for t in model.grid_time), but I've been searching unsuccessfully for how to create such a variable index set.
I would appreciate any help or comments!
Would something like this work?
def Sum_rule(model, v, t):
    return sum(model.Ech[t2] for t2 in model.grid_time if t2 <= t) <= model.Edem[v, t]
model.Sum_constraint = Constraint(model.V, model.grid_time, rule=Sum_rule)
Essentially, what happens is that the t in Sum_rule(model, v, t) makes sure that the constraint is generated for each t in model.grid_time. The t2 in the sum is also drawn from model.grid_time, but it only takes values that are smaller than or equal to the t at which the constraint is generated.
I am not sure my constraint matches your notation exactly, as you have not provided all the required information (e.g. regarding the subscript v of the E^dem variable), but it will basically do what you want with the sum.
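For completeness, a single-index variant using the names from the question (the v subscript dropped; energy_demand is assumed to be a pandas DataFrame indexed by t with a "total_energy_demand" column, and cumulative_rule / model.cumulative_limit are just illustrative names) would be:

def cumulative_rule(model, t):
    # charging energy accumulated from the first period up to and including t
    return sum(model.charging_energy[t2] for t2 in model.grid_time if t2 <= t) >= energy_demand.at[t, "total_energy_demand"]
model.cumulative_limit = Constraint(model.grid_time, rule=cumulative_rule)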
I have variables whose values change every hour during the day (24 values):
plants = ['Plant1', 'Plant2']
users = ['user1', 'user2']
time_steps = range(0,24)
p_gen = model.addVars(plants, time_steps, name="pow_gen")
tot_consume = model.addVars(users, time_steps, name="total_demand")
p_grid = model.addVars(time_steps, lb = -GRB.INFINITY, name="exch_pow")
I want to implement something like this:
if (quicksum(p_gen[pp, t] for pp in plants) - quicksum(tot_consume[u, t] for u in users)) >= p_grid[t] for t in time_steps:
    model.addConstrs(A)
    model.addConstrs(B)
else:
    model.addConstrs(C)
My problem is that Gurobi does not understand the time-dependent variables in this condition. I want to implement an if condition, so that depending on whether it holds the program adds different constraints.
How can I implement this condition in Gurobi?
Linear Programming doesn't work like this.
You have constraints and your model must fulfill them, otherwise your model is infeasible.
You can't add constraints that depend on the outcome of other constraints; at most you can add constraints based on boolean conditions known up front (a setting, a data value, ...), or you can add boolean constraints.
You can, however, build two models with the same variables and constraints up to the if/else branches.
You can solve the first model, read the value you need through the x attribute (just call variable.x to get its value), use that value to decide which constraints to add to the second model, and then solve it.
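A rough sketch of that workflow (the variable names and the placeholder constraints are illustrative, not taken from the question):

import gurobipy as gp
from gurobipy import GRB

def build_base_model():
    # everything common to both models goes here
    m = gp.Model()
    p_grid = m.addVars(range(24), lb=-GRB.INFINITY, name="exch_pow")
    # ... shared variables, constraints and objective ...
    return m, p_grid

# solve the first model to learn the values that drive the condition
m1, p_grid1 = build_base_model()
m1.optimize()

# build the second model and add constraints depending on the first solution
m2, p_grid2 = build_base_model()
for t in range(24):
    if p_grid1[t].x >= 0:                  # condition evaluated on solved values
        m2.addConstr(p_grid2[t] <= 100)    # placeholder for constraints A / B
    else:
        m2.addConstr(p_grid2[t] >= -100)   # placeholder for constraint C
m2.optimize()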
Recently I read a problem to practice DP. I wasn't able to come up with one, so I tried a recursive solution which I later modified to use memoization. The problem statement is as follows :-
Making Change. You are given n types of coin denominations of values
v(1) < v(2) < ... < v(n) (all integers). Assume v(1) = 1, so you can
always make change for any amount of money C. Give an algorithm which
makes change for an amount of money C with as few coins as possible.
[on problem set 4]
I got the question from here
My solution was as follows :-
def memoized_make_change(L, index, cost, d):
    if index == 0:
        return cost
    if (index, cost) in d:
        return d[(index, cost)]
    count = cost // L[index]
    val1 = memoized_make_change(L, index-1, cost % L[index], d) + count
    val2 = memoized_make_change(L, index-1, cost, d)
    x = min(val1, val2)
    d[(index, cost)] = x
    return x
This is how I've understood my solution to the problem. Assume that the denominations are stored in L in ascending order. As I iterate from the end to the beginning, I have a choice to either choose a denomination or not choose it. If I choose it, I then recurse to satisfy the remaining amount with lower denominations. If I do not choose it, I recurse to satisfy the current amount with lower denominations.
Either way, at a given function call, I find the best(lowest count) to satisfy a given amount.
Could I have some help in bridging the thought process from here onward to reach a DP solution? I'm not doing this as any HW, this is just for fun and practice. I don't really need any code either, just some help in explaining the thought process would be perfect.
[EDIT]
I recall reading that function calls are expensive, and that this is why bottom-up (iteration-based) DP might be preferred. Is that possible for this problem?
Here is a general approach for converting memoized recursive solutions to "traditional" bottom-up DP ones, in cases where this is possible.
First, let's express our general "memoized recursive solution". Here, x represents all the parameters that change on each recursive call. We want this to be a tuple of non-negative integers - in your case, (index, cost). I omit anything that's constant across the recursion (in your case, L), and I suppose that I have a global cache. (But FWIW, in Python you should just use the lru_cache decorator from the standard library functools module rather than managing the cache yourself.)
To solve for(x):
    If x in cache: return cache[x]
    Handle base cases, i.e. where one or more components of x is zero
    Otherwise:
        Make one or more recursive calls
        Combine those results into `result`
        cache[x] = result
        return result
The basic idea in dynamic programming is simply to evaluate the base cases first and work upward:
To solve for(x):
    For y starting at (0, 0, ...) and increasing towards x:
        Do all the stuff from above
However, two neat things happen when we arrange the code this way:
As long as the order of y values is chosen properly (this is trivial when there's only one vector component, of course), we can arrange that the results for the recursive call are always in cache (i.e. we already calculated them earlier, because y had that value on a previous iteration of the loop). So instead of actually making the recursive call, we replace it directly with a cache lookup.
Since every component of y will use consecutively increasing values, and will be placed in the cache in order, we can use a multidimensional array (nested lists, or else a Numpy array) to store the values instead of a dictionary.
So we get something like:
To solve for(x):
    cache = multidimensional array sized according to x
    for i in range(first component of x):
        for j in ...:
            (as many loops as needed; better yet use `itertools.product`)
            If this is a base case, write the appropriate value to cache
            Otherwise, compute "recursive" index values to use, look up
            the values, perform the computation and store the result
    return the appropriate ("last") value from cache
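To make this concrete for the question's recurrence (kept exactly as posted, with a 2D table taking the place of the dictionary d), a sketch might look like:

def bottom_up_make_change(L, C):
    n = len(L)
    # cache[index][cost] plays the role of d[(index, cost)] in the memoized version
    cache = [[0] * (C + 1) for _ in range(n)]
    cache[0] = list(range(C + 1))          # base case: only L[0] == 1 is available, so `cost` coins
    for index in range(1, n):
        for cost in range(C + 1):
            count = cost // L[index]
            val1 = cache[index - 1][cost % L[index]] + count  # recursive call replaced by a lookup
            val2 = cache[index - 1][cost]
            cache[index][cost] = min(val1, val2)
    return cache[n - 1][C]                 # the "last" value from the cache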
I suggest considering the relationship between the value you are constructing and the values you need for it.
In this case you are constructing a value for index, cost based on:
index-1 and cost
index-1 and cost%L[index]
What you are searching for is a way of iterating over the choices such that you will always have precalculated everything you need.
In this case you can simply change the code to the iterative approach:
for each choice of index, from 0 upwards:
    for each choice of cost:
        compute the value corresponding to (index, cost)
In practice, I find that the iterative approach can be significantly faster (perhaps 4x) for simple problems, as it avoids the overhead of function calls and of checking the cache for preexisting values.
I am trying to write a function that will filter a list of tuples (mimicking an in-memory database), using a "nearest neighbour" or "nearest match" type algorithm.
I want to know the best (i.e. most Pythonic) way to go about doing this. The sample code below hopefully illustrates what I am trying to do.
datarows = [(10, 2.0, 3.4, 100),
            (11, 2.0, 5.4, 120),
            (17, 12.9, 42, 123)]

filter_record = (9, 1.9, 2.9, 99)  # record that we are seeking to retrieve from 'database' (or nearest match)
weights = (1, 1, 1, 1)             # weights to apportion to each field in the filter

def get_nearest_neighbour(data, criteria, weights):
    for row in data:
        # calculate 'distance metric' (e.g. simple differencing) and multiply by relevant weight
        # determine the row which was either an exact match or was 'least dissimilar'
        # return the match (or nearest match)
        pass

if __name__ == '__main__':
    result = get_nearest_neighbour(datarows, filter_record, weights)
    print(result)
For the snippet above, the output should be:
(10,2.0,3.4,100)
since it is the 'nearest' to the sample data passed to the function get_nearest_neighbour().
My question then is: what is the best way to implement get_nearest_neighbour()? For the purpose of brevity etc., assume that we are only dealing with numeric values, and that the 'distance metric' is simply the arithmetic difference between the input data and the current row.
Simple out-of-the-box solution:
import math

def distance(row_a, row_b, weights):
    diffs = [math.fabs(a - b) for a, b in zip(row_a, row_b)]
    return sum(v * w for v, w in zip(diffs, weights))

def get_nearest_neighbour(data, criteria, weights):
    def sort_func(row):
        return distance(row, criteria, weights)
    return min(data, key=sort_func)
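With the sample data from the question this returns the expected row:

result = get_nearest_neighbour(datarows, filter_record, weights)
print(result)   # (10, 2.0, 3.4, 100)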
If you need to work with huge datasets, you should consider switching to NumPy arrays and using a k-d tree (e.g. SciPy's scipy.spatial.KDTree) to find nearest neighbors. The advantage is not only the more advanced algorithm, but also a highly optimized compiled implementation.
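For example, a minimal sketch with scipy.spatial.cKDTree on the question's sample data; scaling each column by its weight and querying with p=1 (Manhattan distance) reproduces the weighted absolute-difference metric used above:

import numpy as np
from scipy.spatial import cKDTree

data = np.asarray(datarows, dtype=float)
w = np.asarray(weights, dtype=float)

tree = cKDTree(data * w)        # build once, query many times
dist, idx = tree.query(np.asarray(filter_record, dtype=float) * w, p=1)
print(datarows[idx])            # (10, 2.0, 3.4, 100)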
About naive-NN:
Many of these other answers propose "naive nearest-neighbor", which is an O(N*d)-per-query algorithm (d is the dimensionality, which in this case seems constant, so it's O(N)-per-query).
While an O(N)-per-query algorithm is pretty bad, you might be able to get away with it if your problem size stays below any of these (for example):
10 queries and 100000 points
100 queries and 10000 points
1000 queries and 1000 points
10000 queries and 100 points
100000 queries and 10 points
Doing better than naive-NN:
Otherwise you will want to use one of the techniques (especially a nearest-neighbor data structure) listed in:
http://en.wikipedia.org/wiki/Nearest_neighbor_search (most likely linked off from that page), some examples linked:
http://en.wikipedia.org/wiki/K-d_tree
http://en.wikipedia.org/wiki/Locality_sensitive_hashing
http://en.wikipedia.org/wiki/Cover_tree
especially if you plan to run your program more than once. There are most likely libraries available. Skipping an NN data structure will take too much time if the product #queries * #points is large. As user 'dsign' points out in the comments, you can probably squeeze out a large additional constant factor of speed by using the numpy library.
However, if you can get away with using the simple-to-implement naive-NN, you should use it.
Use heapq.nsmallest on a generator calculating the distance*weight for each record.
Something like:

import heapq, operator

heapq.nsmallest(N, ((row, dist_function(row, criteria, weights)) for row in data),
                key=operator.itemgetter(1))
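For example, reusing the distance() helper and the sample data from the earlier answer, the three closest rows (as (row, distance) pairs, closest first) would be:

three_nearest = heapq.nsmallest(3, ((row, distance(row, filter_record, weights)) for row in datarows),
                                key=operator.itemgetter(1))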