Mapping and iterating nested dictionaries - python

I am not too familiar with Python but have a working understanding of the basics. I believe that I need dictionaries, but what I am currently doing is not working and is likely very inefficient time-wise.
I am trying to create a cross matrix that links reviews between users, given the list of reviewers, their individual reviews, and metadata related to the reviews.
NOTE: This is written in Python 2.7.10. I cannot use Python 3 because the systems this will run on are outdated.
For initialization I have the following:
print '\nCompiling Review Maps... '
LbidMap = {}
TbidMap = {}
for user in reviewer_idx:
    for review in data['Reviewer Reviews'][user]:
        reviewInfo = data['Review Information'][review]
        stars = float(reviewInfo['stars'])
        bid = reviewInfo['business_id']
        # Initialize lists where necessary
        # !!!! I know this is probably not effective, but am unsure of
        # a better method. Open to suggestions !!!!!
        if bid not in LbidMap:
            LbidMap[bid] = {}
            TbidMap[bid] = {}
        if stars not in LbidMap[bid]:
            LbidMap[bid][stars] = {}
        if user not in TbidMap[bid]:
            TbidMap[bid][user] = {}
        # Track information on ratings to each business
        LbidMap[bid][stars][user] = review
        TbidMap[bid][user][review] = stars
(where 'bid' is short for "Business ID", pos_list is an input given by user at runtime)
I then go on and try to create a mapping of users who gave a "positive" review to a business T who also gave business L a rating of X (e.g., 5 people rated business L 4/5 stars, how many of those people also gave a "positive" review to business T?)
For mapping I have the following:
# Determine and map all users who rated business L as rL
# and gave business T a positive rating
print '\nCross matching ratings across businesses'
cross_TrL = []
for Tbid in TbidMap:
    for Lbid in LbidMap:
        # Ensure T and L aren't the same business
        if Tbid != Lbid:
            for stars in LbidMap[Lbid]:
                starSum = len(LbidMap[Lbid][stars])
                posTbid = 0
                for user in LbidMap[Lbid][stars]:
                    if user in TbidMap[Tbid]:
                        rid = LbidMap[Lbid][stars][user]
                        print 'Tbid:%s Lbid:%s user:%s rid:%s' % (Tbid, Lbid, user, rid)
                        reviewRate = TbidMap[Tbid][user][rid]
                        # If true, then we have pos review for T from L
                        if reviewRate in pos_list:
                            posTbid += 1
                numerator = posTbid + 1
                denominator = starSum + 1
                probability = float(numerator) / denominator
I currently receive the following error (print out of current vars also provided):
Tbid:OlpyplEJ_c_hFxyand_Wxw Lbid:W0eocyGliMbg8NScqERaiA user:Neal_1EVupQKZKv3NsC2DA rid:TAIDnnpBMR16BwZsap9uwA
Traceback (most recent call last):
File "run_edge_testAdvProb.py", line 90, in <module>
reviewRate = TbidMap[Tbid][user][rid];
KeyError: u'TAIDnnpBMR16BwZsap9uwA'
So, I know the KeyError is on what should be the rid (review ID) at that particular moment within TbidMap, however it seems to me that the Key was somehow not included within the first code block of initialization.
What am I doing wrong? Additionally, suggestions on how to speed up the second code block are welcome.
EDIT: I realized that I was trying to locate rid of Tbid using the rid from Lbid, however rid is unique to each review so you would not have a Tbid.rid == Lbid.rid.
Updated the second code block, as such:
cross_TrL = []
for Tbid in TbidMap:
    for Lbid in LbidMap:
        # Ensure T and L aren't the same business
        if Tbid != Lbid:
            # Get number of reviews at EACH STAR rate for L
            for stars in LbidMap[Lbid]:
                starSum = len(LbidMap[Lbid][stars])
                posTbid = 0
                # For each review check if user rated the Tbid
                for Lreview in LbidMap[Lbid][stars]:
                    user = LbidMap[Lbid][stars][Lreview]
                    if user in TbidMap[Tbid]:
                        # user rev'd Tbid, get their Trid
                        # and see if they gave Tbid a pos rev
                        for Trid in TbidMap[Tbid][user]:
                            # Currently this does not account for multiple reviews
                            # given by the same person. Just want to get this
                            # working and then I'll minimize this
                            Tstar = TbidMap[Tbid][user][Trid]
                            print 'Tbid:%s Lbid:%s user:%s Trid:%s' % (Tbid, Lbid, user, Trid)
                            if Tstar in pos_list:
                                posTbid += 1
                numerator = posTbid + 1
                denominator = starSum + 1
                probability = float(numerator) / denominator
                evaluation = {'Tbid': Tbid, 'Lbid': Lbid, 'star': stars, 'prob': probability}
                cross_TrL.append(evaluation)
Still slow, but I no longer receive the error.
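On the request for a better way to initialize the nested maps: one option from the standard library is collections.defaultdict, which creates the inner dictionaries on first access and removes the need for the "if key not in ..." checks. The following is only a sketch. The sample data dictionary and reviewer_idx list are made up to mirror the shapes used in the question, and the code runs unchanged on Python 2.7 and Python 3.

```python
from collections import defaultdict

# Hypothetical sample data shaped like the question's `data` dict.
data = {
    'Reviewer Reviews': {'u1': ['r1', 'r2'], 'u2': ['r3']},
    'Review Information': {
        'r1': {'stars': '4', 'business_id': 'bL'},
        'r2': {'stars': '5', 'business_id': 'bT'},
        'r3': {'stars': '4', 'business_id': 'bL'},
    },
}
reviewer_idx = ['u1', 'u2']

# Nested defaultdicts build the inner dicts the first time a key is touched.
LbidMap = defaultdict(lambda: defaultdict(dict))  # bid -> stars -> user -> review
TbidMap = defaultdict(lambda: defaultdict(dict))  # bid -> user -> review -> stars

for user in reviewer_idx:
    for review in data['Reviewer Reviews'][user]:
        info = data['Review Information'][review]
        stars = float(info['stars'])
        bid = info['business_id']
        LbidMap[bid][stars][user] = review
        TbidMap[bid][user][review] = stars
```

One caveat with this design: merely reading a missing key from a defaultdict (for example LbidMap[some_bid]) inserts an empty entry, so membership tests with "in" are the safer way to probe the maps afterwards.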

Related

Get all transactions from OKX with python

I am trying to build a full overview of my transactions (buy/sell/deposit/withdrawal/earnings and bot trades) on OKX with Python, but I get only 2 trades back, although I have made more than 2.
I have tried sending requests with orders-history-archive and with fetchMyTrades from the CCXT library (and some other functions), but I still don't get all my transactions.
Is there some way to get a full overview for OKX with Python (and for other brokers/wallets)?
Here is how I try to get the data with CCXT (it gives only 2 results):
def getMyTrades(self):
    tData = []
    tSymboles = [
        'BTC/USDT',
        'ETH/USDT',
        'SHIB/USDT',
        'CELO/USDT',
        'XRP/USDT',
        'SAMO/USDT',
        'NEAR/USDT',
        'ETHW/USDT',
        'DOGE/USDT',
        'SOL/USDT',
        'LUNA/USDT'
    ]
    for item in tSymboles:
        if exchange.has['fetchMyTrades']:
            since = exchange.milliseconds() - 60*60*24*180*1000  # -180 days from now
            while since < exchange.milliseconds():
                symbol = item  # change for your symbol
                limit = 20  # change for your limit
                orders = exchange.fetchMyTrades(symbol, since, limit)
                if len(orders):
                    since = orders[len(orders) - 1]['timestamp'] + 1
                    tData += orders
                else:
                    break
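Note that the function as posted builds tData but never returns it. Separately, the "advance since past the last timestamp" loop can be pulled out into a small helper and checked without an exchange connection. The sketch below is only an illustration of that loop: fetch_all, fake_fetch and FAKE_TRADES are hypothetical names, and the fake data stands in for whatever the exchange actually returns. It does not address any limit the exchange itself may place on how far back trades are available.

```python
def fetch_all(fetch_page, since, now, limit=20):
    """Collect every record by advancing `since` past the last timestamp seen."""
    records = []
    while since < now:
        page = fetch_page(since, limit)
        if not page:
            break  # nothing newer than `since`
        records.extend(page)
        since = page[-1]['timestamp'] + 1
    return records


# Hypothetical stand-in for an exchange call: 45 fake trades, 10 ms apart.
FAKE_TRADES = [{'id': i, 'timestamp': 1000 + 10 * i} for i in range(45)]


def fake_fetch(since, limit):
    """Return up to `limit` fake trades with timestamp >= since."""
    matching = [t for t in FAKE_TRADES if t['timestamp'] >= since]
    return matching[:limit]


all_trades = fetch_all(fake_fetch, since=0, now=10**6)
```

With the snippet above, the real call could be passed in as lambda since, limit: exchange.fetchMyTrades(symbol, since, limit), keeping the paging logic in one place for every symbol.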

Spotipy: 'Index out of range' when trying to retrieve album_ids so that I can get their release dates

I am working with this data, which I extracted from a public playlist and shows all of the number 1's since 1953 with their audio features: https://raw.githubusercontent.com/StanWaldron/StanWaldron.github.io/main/FinalData.csv
I am now trying to loop through and find their album ids so that I can retrieve their release date and plot their audio features against other time series data, using this code:
def find_album_release(name):
    album_ids = []
    for x in name:
        results = sp.search(q="album:" + x, type="album")
        if not results["albums"]["items"]:
            return []
        album_id = results['albums']['items'][0]['uri']
        album_ids.append(album_id)
        print(album_id)
    return album_ids

final = pd.read_csv('FinalData.csv')
albumlist = final['album']
finalalbums = find_album_release(albumlist)
It works for the first 7 and then returns nothing. Without the if statement, it returns that the index is out of range. I have tested the 8th element by hard coding in its album name and it returns the correct result, this is the same for the next 4 in the list so it isn't an issue with the searching of these album names. I have played around with the lists but I am not entirely sure what is out of range of what.
Any help is greatly appreciated
The album name in the 8th row contains apostrophes (Don't Stop Me Eatin'). I removed them and it worked. Maybe you should check which characters are allowed in the query parameters.
def find_album_release(name):
    album_ids = []
    for x in name:
        x = x.replace("'", "")  # Remove the quotes from album name
        results = sp.search(q="album:" + x, type="album")
        ....
        ....

final = pd.read_csv('FinalData.csv')
albumlist = final['album']
finalalbums = find_album_release(albumlist)
The output for me:
spotify:album:31lHUoHC3P6BRFzKYLyRJO
spotify:album:6s84u2TUpR3wdUv4NgKA2j
spotify:album:4OyzQQJHEfKXRfyN4QyLR7
spotify:album:2Hjcfw8zHN4dJDZJGOzLd6
spotify:album:1zEBi4O4AaY5M55dUcUp3z
spotify:album:0Hi8bTOS35xZM0zZ6S89hT
spotify:album:5GGIgiGtxIgcVJQnsKQW94
spotify:album:3rLjiJI34bHFNIFqeK3y9s
spotify:album:6q1MiYTIE28nFzjkvLLt0I
spotify:album:61ulfFSmmxMhc2wCdmdMkN
spotify:album:3euz4vS7ezKGnNSwgyvKcd
spotify:album:1pFaEu56zqpzSviJc3htZN
spotify:album:4PTxbJPI4jj0Kx8hwr1v0T
spotify:album:2ogiazbrNEx0kQHGl5ZBTQ
spotify:album:5glfCPECXSHzidU6exW8wO
spotify:album:1XMw3pBrYeXzNXZXc84DNw
spotify:album:623PL2MBg50Br5dLXC9E9e
spotify:album:4TqgXMSSTwP3RCo3MMSR6t
spotify:album:3xIwVbGJuAcovYIhzbLO3J
spotify:album:3h2xv1tJgDnJZGy5Roxs5A
spotify:album:66xP0vUo8to8ALVpkyKc41
spotify:album:6XcYTEonLIpg9NpAbJnqrC
spotify:album:5sXSHscDjBez8VF20cSyad
spotify:album:6pQZPa398NswBXGYyqHH7y
spotify:album:0488X5veBK6t3vSmIiTDJY
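A further detail in the question's function: "return []" inside the loop ends the whole function at the first album with no search results, which matches the "works for the first 7 and then returns nothing" behaviour. A possible variant skips such albums with "continue" instead. This is a sketch, not the answerer's code: find_album_ids and fake_search are hypothetical names, and fake_search only imitates the shape of the dictionary that sp.search returns in the question so that the example runs without Spotify credentials.

```python
def find_album_ids(names, search):
    """Return the first album URI found for each name, skipping names with no match."""
    album_ids = []
    for name in names:
        cleaned = name.replace("'", "")  # apostrophes removed, as suggested in the answer
        results = search(q="album:" + cleaned, type="album")
        items = results["albums"]["items"]
        if not items:
            continue  # skip this album instead of leaving the function
        album_ids.append(items[0]["uri"])
    return album_ids


# Hypothetical stand-in for sp.search with a two-album catalogue.
def fake_search(q, type):
    catalogue = {
        "album:Dont Stop": "spotify:album:AAA",
        "album:Second": "spotify:album:BBB",
    }
    uri = catalogue.get(q)
    return {"albums": {"items": [{"uri": uri}] if uri else []}}


ids = find_album_ids(["Don't Stop", "Missing", "Second"], fake_search)
```

In real use, sp.search would be passed as the second argument in place of fake_search.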

Pyomo optimization Investments/Revenue

I'm new to Pyomo and I'm trying to optimise investments depending on budgets.
I have a total budget, and I want to find the best way to split the budget on the different medias.
eg: total_budget = 5000 --> tv = 3000, cinema = 500, radio = 1500.
I'm struggling "connecting" a Budget with a corresponding Revenue.
The medias have different return curves (It might be better to invest in a specific media until a certain budget is reached, then other medias).
The revenue for the different media is returned by a function like the following: tv_1k_revenue = calculate_revenue(budget=1000, media="tv")
Let's say the only constraint I have is the total budget, to simplify the problem (I think I can manage other constraints).
Here is my code so far:
model = pyo.ConcreteModel(doc="Optimization model")
# Declaration of possible budgets
model.S1 = Set(initialize=[*df.TV_Budget.values])
model.tv_budget = Var(model.S1, initialize=0.0)
model.S2 = Set(initialize=[*df.Cinema_Budget.values])
model.cinema_budget = Var(model.S2, initialize=0.0)
model.S3 = Set(initialize=[*df.Radio_Budget.values])
model.radio_budget = Var(model.S3, initialize=0.0)
# Objective function
def func_objective(model):
    objective_expr = sum(model.tv_revenue +
                         model.cinema_revenue +
                         model.radio_revenue)
    return objective_expr
model.objective = pyo.Objective(rule=func_objective, sense=pyo.maximize)
So my problem is, how do I declare model.tv_revenue, model.cinema_revenue, model.radio_revenue so I can optimise TV, Cinema and Radio budgets to maximize the total revenue generated by TV, Cinema, Radio?
Right now I created a DataFrame with a Budget and Revenue column for each media, but the best way should be using my calculate_revenue function and set bounds=(min_budget, max_budget) on each media budget.
Thank you for your help!
Thank you very much @AirSquid!
That's exactly it.
It does make a lot of sense to throw out pandas in my case.
Also, yes, my revenue function is non-linear.
I might try to make a linear approximation and see if I can make that work.
I was going to try to declare my objective function as:
def func_objective(model):
    objective_expr = sum([calculate_revenue(model.budget[media], media=media) for media in model.medias])
    return objective_expr
model.objective = pyo.Objective(rule=func_objective, sense=pyo.maximize)
Would you know why I cannot declare it like this?
From what you are providing and your limited experience with pyomo, here are my recommendations:
You appear to have budgets and revenues, and those appear to be indexed by media type. It isn't clear what you are doing now with the indexing. So I would expect something like:
model.medias = pyo.Set(initialize=['radio', 'tv', ... ])
model.budget = pyo.Var(model.medias, domain=pyo.NonNegativeReals)
...
Throw pandas out the window. It is a great pkg, but not that helpful in setting up a model. Try something with just python dictionaries to hold your constants & parameters. (see some of my other examples if that is confusing).
The problem you will get to eventually, I'm betting, is that your revenue function is probably non-linear. Right? I would start with a simple linear approximation of it, see if you can get that model working, and then consider either making a piece-wise linear approximation or using a non-linear solver of some kind.
===================
Edit / Additional Info.
Regarding the obj function, you cannot just stuff in a reference to a non-linear function that returns a value. The objective needs to be a valid pyomo expression (linear or non-linear), comprised of model elements. I would start w/ something like this...
# media mix
import pyomo.environ as pyo

# data for linear approximations of form revenue = c1 * budget + c0
#          media     c0  c1
consts = {'radio': (4, 0.6),
          'tv':    (12, 0.45)}
# a bunch of other parameters....?? limits, minimums, etc.

### MODEL
m = pyo.ConcreteModel('media mix')

### SETS
m.medias = pyo.Set(initialize=consts.keys())

### VARIABLES
m.budget = pyo.Var(m.medias, domain=pyo.NonNegativeReals)

### OBJ
m.obj = pyo.Objective(expr=sum(consts[media][1]*m.budget[media] + consts[media][0] for media in m.medias),
                      sense=pyo.maximize)

m.pprint()
Yields:
1 Set Declarations
    medias : Size=1, Index=None, Ordered=Insertion
        Key : Dimen : Domain : Size : Members
        None : 1 : Any : 2 : {'radio', 'tv'}
1 Var Declarations
    budget : Size=2, Index=medias
        Key : Lower : Value : Upper : Fixed : Stale : Domain
        radio : 0 : None : None : False : True : NonNegativeReals
        tv : 0 : None : None : False : True : NonNegativeReals
1 Objective Declarations
    obj : Size=1, Index=None, Active=True
        Key : Active : Sense : Expression
        None : True : maximize : 0.6*budget[radio] + 4 + 0.45*budget[tv] + 12
3 Declarations: medias budget obj
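The (c0, c1) constants in the model above have to come from somewhere. One simple way to get a first linear approximation is an ordinary least-squares line through a few sampled budgets. The sketch below uses plain Python only. The calculate_revenue shown here is a made-up square-root curve standing in for the asker's real function, and the sampled budgets are arbitrary, so the resulting numbers are purely illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (c0, c1) for y = c1 * x + c0."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


# Hypothetical stand-in for the asker's calculate_revenue: diminishing returns.
def calculate_revenue(budget, media):
    scale = {'radio': 30.0, 'tv': 50.0}[media]
    return scale * budget ** 0.5


budgets = [500, 1000, 1500, 2000, 2500]  # arbitrary sample points
consts = {}
for media in ('radio', 'tv'):
    revenues = [calculate_revenue(b, media) for b in budgets]
    consts[media] = fit_line(budgets, revenues)  # (c0, c1), same layout as above
```

The resulting consts dictionary has the same (c0, c1) layout as the one fed to the objective, so it can be swapped in directly. A single line is a coarse fit for a curved revenue function, which is why a piece-wise linear approximation or a non-linear solver is the suggested next step.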

How to find / fix attribute error on PSSE?

I am receiving an error when trying to find the total load and generation in an area. I keep getting an attribute error. Where can I find the specific attributes for the values returned by psspy.ardat? For the load, .real is correct, but for generation, .complex is incorrect.
I keep getting this error:
AttributeError: 'complex' object has no attribute 'complex'
[ierr, sysload_N] = psspy.ardat(1, 'LOAD')
[ierr, sysload_D] = psspy.ardat(2, 'LOAD')
[ierr, sysload_M] = psspy.ardat(3, 'LOAD')
[ierr, sysgen_NI] = psspy.ardat(1, 'GEN')
[ierr, sysgen_DI] = psspy.ardat(2, 'GEN')
[ierr, sysgen_MI] = psspy.ardat(3, 'GEN')
sysload_TOT = sysload_N.real + sysload_D.real+sysload_M.real
output = 'Total Load iS #: {} MW\t'
formatted = output.format(sysload_TOT)
sysgen_TOT = sysgen_NI.complex + sysgen_DI.real+sysgen_MI.complex
output2 = 'Total Generation iS #: {} MW\t'
formatted2 = output2.format(sysgen_TOT)
sysLG_TOT=(sysload_TOT-sysgen_TOT)/(sysload_TOT)*100
output3 = 'Total Imports iS #: {}%\t'
formatted3 = output3.format(sysLG_TOT)
output.append(formatted)
output2.append(formatted2)
output3.append(formatted3)
print(output)
print(output2)
print(output3)
The function psspy.ardat() returns [ierr, cmpval] where ierr is an integer object and cmpval is a complex object as described by the docstring below and repeated in the API documentation:
"""
Use this API to return area totals.
Python syntax:
ierr, cmpval = ardat(iar, string)
where:
Integer IAR Area number (input).
Character STRING String indicating the area total desired (input).
= LOAD, Total owner load by bus owner assignment (net of load plus in-service distributed
generation on load feeder).
= LOADLD, Total owner load by load owner assignment.
= LDGN, Total distributed generation on load feeder by bus owner assignment.
= LDGNLD, Total distributed generation on load feeder by load owner assignment.
= GEN, Total owner generation.
= LOSS, Total owner losses.
= INT, Net area interchange.
= INDMAC, Total owner induction machine powers by bus owner assignment.
= INDMACMC, Total owner induction machine powers by machine owner assignment.
= INDGEN, Total owner induction generator powers by bus owner assignment.
= INDGENMC, Total owner induction generator powers by machine owner assignment.
= INDMOT, Total owner induction motor powers by bus owner assignment.
= INDMOTMC, Total owner induction motor powers by machine owner assignment.
Complex CMPVAL Desired complex power (output).
Integer IERR Error code (output).
= 0, No error; 'P' and 'Q' or 'CMPVAL' returned.
= 1, Area number < 0 or > largest allowable area number; 'P' and 'Q' or 'CMPVAL' unchanged.
= 2, No in-service buses with in-service loads (for 'LOAD'), no in-service loads (for
'LOADLD'), no type 2 or type 3 buses (for 'GEN'), no branches (for 'LOSS'), or no ties
(for 'INT') in area; 'P' and 'Q' or 'CMPVAL' unchanged.
= 3, Area not found; 'P' and 'Q' or 'CMPVAL' unchanged.
= 4, Bad 'STRING' value; 'P' and 'Q' or 'CMPVAL' unchanged.
"""
complex is a built-in Python type with the attributes .real and .imag, but it does not have a .complex attribute; this is why you are getting the AttributeError. Try the following:
sysload_TOT = sysload_N.real + sysload_D.real + sysload_M.real
output1 = 'Total Load iS #: {} MW\t'
formatted = output1.format(sysload_TOT)
sysgen_TOT = sysgen_NI.real + sysgen_DI.real + sysgen_MI.real
output2 = 'Total Generation iS #: {} MW\t'
formatted2 = output2.format(sysgen_TOT)
sysLG_TOT = 100 * (sysload_TOT - sysgen_TOT) / (sysload_TOT)
output3 = 'Total Imports is #: {}%\t'
formatted3 = output3.format(sysLG_TOT)
print(formatted)
print(formatted2)
print(formatted3)
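The behaviour of the complex type can be checked without PSSE. In the sketch below the area totals are invented numbers standing in for the cmpval values returned by psspy.ardat, and it is assumed that the real part carries active power (MW) and the imaginary part reactive power (MVAr), in line with the "complex power" wording of the docstring.

```python
# Hypothetical area totals standing in for psspy.ardat() results.
loads = [complex(120.0, 35.0), complex(80.0, 20.0), complex(50.0, 10.0)]
gens = [complex(100.0, 30.0), complex(60.0, 15.0), complex(40.0, 5.0)]

total_load_mw = sum(s.real for s in loads)    # real parts (assumed active power)
total_load_mvar = sum(s.imag for s in loads)  # imaginary parts (assumed reactive power)
total_gen_mw = sum(s.real for s in gens)

import_pct = 100 * (total_load_mw - total_gen_mw) / total_load_mw

print('Total Load: {} MW'.format(total_load_mw))
print('Total Generation: {} MW'.format(total_gen_mw))
print('Total Imports: {}%'.format(import_pct))

# Accessing an attribute that does not exist reproduces the error from the question.
try:
    loads[0].complex
except AttributeError as err:
    message = str(err)
```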
If you are going to be performing this functionality often I would recommend the following approach:
# create a subsystem which contains the areas of interest
psspy.asys(sid=0, num=3, areas=[1,2,3])
# return an array of real values for subsystem areas
ierr, rarray = psspy.aareareal(sid=0, string=['PLOAD', 'PGEN', 'PINT'])
Whenever you are using psspy functions, referring to the documentation is very important.

Ask for clues for the “It's All About the Miles”

I recently encountered a problem that I really cannot figure out how to solve. It's a problem on Open Kattis.
Please visit https://uchicago.kattis.com/problems/uchicago.miles
By now, I know it's a recursion problem.
But how do I define this recursive procedure? I don't know where I should start.
So please give me a clue or maybe some pseudocode.
Here is my code for reading the input; it converts the input data into a dictionary.
AFItt = input().split()
A, F, I = map(int, AFItt[0:3])
tmin, tmax = map(float, AFItt[3:])
airport = []
ada = {}
ai = []
for _ in range(A):
    airport.append(input())
for _ in range(F):
    ffda = input().split()
    if ffda[0] + " " + ffda[1] not in ada.keys():
        ada[ffda[0] + " " + ffda[1]] = (float(ffda[2]), float(ffda[3]))
    else:
        ada[ffda[0] + " " + ffda[1]] += ((float(ffda[2]), float(ffda[3])))
for _ in range(I):
    ai.append(input())
I will try to give you a clue, though I am not sure whether it is efficient enough. I wrote a JavaScript version and it produces the sample outputs correctly.
The idea of my solution is very simple: starting from the beginning of the itinerary, find all possible next flights and keep appending them to the previous flight runs.
For example:
For the first 2 itinerary airports, I find all the possible flights and save them in a list of lists: [[flight1], [flight2], [flight3]].
After that, I loop over all the current possible runs and check whether there exists a flight that lets each run continue. If not, the run is excluded; if yes, the flight is appended to the run.
If flight1 and flight2 cannot continue, but flight3 has two possible continuing flights, my flight list becomes [[flight3, flight4], [flight3, flight5]].
It is a bit hard for me to explain well. The following is a code skeleton:
function findAllFlights(flightMap,
                        currentFlights,
                        currentItineraryIndex,
                        itineraryList, minTime, maxTime){
    //flightMap is a map of all the flights. sample data:
    /*{'a->b':[{from: 'a', to:'b', depTime:'1', arrTime:'2'}, {another flight}, ... ],
       'b->c': [{from: 'b', to:'c', depTime:'1', arrTime:'2'}, {another flight}, ... ]}
    */
    //currentFlights is the result of current possible runs, it is a list of list of flights. each sub list means a possible run.
    //[[flight1, flight2], [flight1, flight3], ...]
    //currentItineraryIndex: this is the next airport index in the itineraryList
    //itineraryList: this is the list of airports we should travel.
    //minTime, maxTime: it is the min time and max time.
    if(currentItineraryIndex == 0){
        var from = itineraryList[0];
        var to = itineraryList[1];
        var flightMapKey = from+'->'+to;
        var possibleFlights = flightMap[flightMapKey];
        if(possibleFlights.length == 0){
            return [];
        }
        for(var i=0; i<possibleFlights.length; i++){
            //current flights should be a list of list of flights.
            //each of the sub list denotes the journey currently.
            currentFlights.push([possibleFlights[i]]);
        }
        return findAllFlights(flightMap, currentFlights, 1, itineraryList, minTime, maxTime);
    }else if(currentItineraryIndex == itineraryList.length - 1){
        //we have searched all the required airports
        return currentFlights;
    }else{
        //this is where you need to recursively call findAllFlights method.
        var continableFlights = [];
        //TODO: try to produce the continuable list of flights here based on above explanation.
        //once we have the continuable flights for current itinerary airport, we can find flights for next airport similarly.
        return findAllFlights(flightMap, continableFlights, currentItineraryIndex + 1, itineraryList, minTime, maxTime);
    }
}
Enjoy!
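Since the question's own code is in Python, here is a rough Python sketch of the run-extension idea described above, written as a loop rather than a recursion. It is not a solution to the Kattis problem. The problem statement is not reproduced here, so the sketch assumes that the min and max times bound the waiting time between one flight's arrival and the next flight's departure, and that each flight is a (departure_time, arrival_time) tuple. The name extend_runs and the sample data are hypothetical.

```python
def extend_runs(flight_map, itinerary, min_wait, max_wait):
    """Return every list of flights that follows the itinerary leg by leg."""
    # Start with one single-flight run per flight on the first leg.
    first_leg = (itinerary[0], itinerary[1])
    runs = [[f] for f in flight_map.get(first_leg, [])]

    for i in range(1, len(itinerary) - 1):
        leg = (itinerary[i], itinerary[i + 1])
        next_runs = []
        for run in runs:
            arrival = run[-1][1]  # arrival time of the last flight in the run
            for flight in flight_map.get(leg, []):
                wait = flight[0] - arrival
                if min_wait <= wait <= max_wait:  # assumed connection rule
                    next_runs.append(run + [flight])
        runs = next_runs  # runs that could not continue are dropped here
    return runs


# Hypothetical data: each flight is a (departure_time, arrival_time) tuple.
flight_map = {
    ('A', 'B'): [(1.0, 2.0), (3.0, 4.0)],
    ('B', 'C'): [(2.5, 5.0), (9.0, 10.0)],
}
runs = extend_runs(flight_map, ['A', 'B', 'C'], min_wait=0.5, max_wait=2.0)
```

Keying the map by an (origin, destination) tuple avoids building string keys such as "a->b", and storing each leg's flights as a list of tuples keeps every flight separate.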
